Suppose we assume that there is some probability density function (pdf for short) for the men and another for the women, but we are unwilling to give a commitment to gaussians or any other family of functions to represent them. About the weakest condition we might apply is that the two pdf's are continuous. We could try to estimate them locally, or at least the likelihood ratio, in a neighbourhood of the new datum. One way of doing this is to take a ball in the space centred on the new point. Now count the number of points in each category that are within the ball. The ratio of these two numbers is our estimate of the likelihood ratio. These are called Parzen estimates.
Of course, one number or the other might easily be zero if the ball is too small, but if the ball is too big it might measure only the total number of points in each category. Oh well, life wasn't meant to be easy.
An alternative is to take, for some positive integer k, the k nearest neighbours of the new point, and count those in each category. Again, the bigger number is the best guess. This is called the k nearest neighbours algorithm. It looks rather like the metric method with which we started the quest to get sensible answers to the pattern recognition problem, but is making different assumptions about the nature of the process producing the data. Again, there is a problem of how to measure the distances. In either alternative, it is possible to weight the count of points inversely by distance from the place of interest, so that remote points count less than close ones. This brings us back, yet again, to the question of what the right metric is.
Some people have argued that Artificial Neural Nets are just a non-parametric statistical method of making decisions: this is debatable but not profitably.