% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/generate.R
\name{filterGraph}
\alias{filterGraph}
\title{Filter graph to remove vertices that are not well connected}
\usage{
filterGraph(df, minAny = 11L, minDifferent = 2L)
}
\arguments{
\item{df}{a data frame with pairs of vertices given in columns \code{pa1} and \code{pa2}, and item response data in other columns}

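\item{minAny}{the minimum number of edges a vertex must have to be retained, regardless of how many different vertices it is connected to}

\item{minDifferent}{the minimum number of different vertices a vertex must be connected to in order to be retained, regardless of its number of edges}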
}
\value{
The same graph with the vertices that are not well connected
  excluded.
}
\description{
Vertices not part of the largest connected component are excluded (Hopcroft & Tarjan, 1973).
Vertices that have fewer than \code{minAny} edges and are not
connected to \code{minDifferent} or more different vertices are
excluded. For example, vertex \sQuote{a} connected to vertices
\sQuote{b} and \sQuote{c} will be included so long as these vertices
are part of the largest connected component.
}
\details{
Given that \code{minDifferent} defaults to 2,
if activity \eqn{A} was compared to at least
two other activities, \eqn{B} and \eqn{C}, then \eqn{A} is retained.
The rationale is that,
although little may be learned about \eqn{A},
there may be a transitive relationship,
such as \eqn{B < A < C}, by which the model can infer that \eqn{B < C}.
Therefore, per-activity sample size is less of a concern
when the graph is densely connected.

A young novice asked the wise master, "Why is 11 the default \code{minAny} instead of 10?"
The master answered, "Because 11 is a prime number."
}
\examples{
df <- filterGraph(phyActFlowPropensity[,c(paste0('pa',1:2),'predict')])
head(df)
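
# Illustrative sketch only, not the implementation used by filterGraph:
# apply the documented degree rule to a hypothetical toy edge list using
# base R. The connected component step is omitted, and it is an assumption
# that each row counts as one edge. The names toy, edges, nAny, nDiff,
# and keep are made up for this example.
toy <- data.frame(pa1 = c('a', 'a', 'b', 'd'),
                  pa2 = c('b', 'c', 'c', 'a'),
                  stringsAsFactors = FALSE)
# List every edge from both endpoints so each vertex sees all its neighbors
edges <- data.frame(v = c(toy$pa1, toy$pa2), nb = c(toy$pa2, toy$pa1),
                    stringsAsFactors = FALSE)
nAny <- tapply(edges$nb, edges$v, length)  # edges per vertex
nDiff <- tapply(edges$nb, edges$v, function(x) length(unique(x)))  # distinct neighbors
# A vertex is kept when either threshold is met (defaults: 11 and 2)
keep <- names(nAny)[nAny >= 11L | nDiff >= 2L]
keep  # d has one edge and one neighbor, so it is dropped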

}
\references{
Hopcroft, J., & Tarjan, R. (1973). Algorithm 447: Efficient algorithms for graph
manipulation. \emph{Communications of the ACM, 16}(6), 372–378.
doi:10.1145/362248.362272
}
