There’s a fair amount of overlap between programmers and mathematicians. Both tend to be very logical thinkers. Both like manipulating abstract entities in their heads, and both are comfortable working with complex symbolic spaces. Moreover, mathematicians often tend to be good programmers, and programmers often have a fairly strong mathematical background. However, the two camps do not necessarily see the world in quite the same way, and programmers are quite capable of doing things that would make a mathematician grind their teeth in annoyance (and vice versa, admittedly).
Dimensions Are Simpler Than You Think
When I write something like arr[3,4], this means I have defined a two-dimensional array with three rows and four columns. I could easily extend this notion to talk about an array such as arr[3,4,5], which is a three-dimensional array with cells arranged in a rectangular prism, three items high by four items wide by five items deep, for 60 cells total. You could even go one step further and talk about a four-dimensional array with cells arranged in a rectangular tesseract (arr[3,4,5,6]), consisting of six such previous arrays going off in yet another dimension, for a total of 360 cells in the hyperprism.
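To make the cell counts concrete, here is a minimal sketch in Python. The helper names make_grid and cell_count are my own inventions for illustration, and nested lists stand in for whatever array type your language provides.

```python
from math import prod

def make_grid(*shape, fill=0):
    """Build a nested-list array; each argument is the integer size of one dimension."""
    if not shape:
        return fill
    head, *rest = shape
    return [make_grid(*rest, fill=fill) for _ in range(head)]

def cell_count(*shape):
    """The total number of cells is the product of the dimension sizes."""
    return prod(shape)

arr = make_grid(3, 4)
print(len(arr), len(arr[0]))   # 3 rows, 4 columns
print(cell_count(3, 4, 5))     # 60
print(cell_count(3, 4, 5, 6))  # 360
```

Adding a dimension is just one more argument, which is exactly why the generalization is so easy to write down and so hard to picture.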
Mathematicians do this all the time because it is easy enough to generalize the notion of an array into arbitrarily many dimensions. Programmers, on the other hand, do not tend to do this quite so often. While it is easy to generalize, it’s surprisingly difficult to think of operations done in four dimensions because we, three-dimensional beings that we are, get hung up on breaking that fourth wall (or dimension).
At this point, mathematicians shrug their shoulders and say, “dimensions, watcha gonna do?”. This gets even more complex when a mathematician talks about something like arr[3.5, 2.32]. This makes no sense from the perspective of the programmer, since in their world the arguments for creating a matrix are always integers. In other words, in the P (or programmer) world, an array refers to a discrete grid of values. In the M (or mathematician) world, an array is in fact simply a function f(x,y,z,…) where each of these variables is a real number or, if they really want to do a mind-warp, a complex number. That’s right; mathematicians do consider expressions like:
to be perfectly legitimate.
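One way to see the P world and the M world side by side is to treat the programmer's grid as a sampling of the mathematician's function. The sketch below assumes a made-up function f (nothing in particular, just something smooth) and builds a 3 by 4 grid from it.

```python
import math

def f(x, y):
    # Hypothetical smooth function, chosen only for illustration.
    return math.sin(x) + y * y

# P world: sample f at integer coordinates to get a discrete 3 x 4 grid.
grid = [[f(i, j) for j in range(4)] for i in range(3)]

# At integer coordinates the two views agree.
print(grid[2][3] == f(2, 3))  # True

# M world: f is also defined between (and beyond) the grid cells.
print(f(3.5, 2.32))
```

The grid can only answer questions at whole-number coordinates, while the function answers them anywhere.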
Of course, at this point, the programmer throws up their hands and says, “but what does that mean!!??”. To understand this, it’s worth understanding what a vector is, and why it’s not just a pair of numbers. A vector, in the way it is usually taught, combines a value with a direction. The value part should be obvious (though it isn’t as obvious as you might expect) but the direction part needs to be fleshed out.
There’s a fundamental characteristic of the universe that we call a length. It is a measure that provides some metric (circular definition alert here, because a metric is a measure) for determining scale. In our particular universe, some physical theories posit about ten of these lengths, of which three are familiar to us: width, height, and depth. What’s of note here is that these three measures represent length in different directions, usually pictured as the three equal edges of a right tetrahedron that meet at a single corner, the one point from which only right angles emerge.
Cartesian Space and Minkowski Space
This view of the world has been around for a while, but it took a French mathematician named René Descartes to recognize that this created a coordinate system, known ever since as a Cartesian system. The right angles are important because it turns out that you can express any vector as the sum of three vectors at right angles, where the components are added together with no overlap. Put another way, you can say that a vector in three dimensions corresponds to the line between two opposing vertices of a rectangular prism where the starting vertex has value (0,0,0) and the ending vertex has value (width, height, depth), each of which is a multiple of a unit vector. This doesn’t mean that your coordinate system must have right-angled unit vectors, only that you get the decomposition into width, height and depth when you do.
Note that all three of these vectors are lengths. This is usually about when some wag says that time should also be considered a dimension, and time isn’t a length. As it turns out, they’re right – time isn’t a length. These are two fundamentally different things. However, velocity is length divided by time, and one thing that was known even by the eighteenth century was that the speed of light, while fast, was not infinitely fast. Scottish mathematician James Clerk Maxwell created a set of equations that described the relationships between the electric and magnetic fields (we’ll come back to this term), and those equations gave the speed of light (c) a central role. Hendrik Lorentz, Henri Poincaré, and Hermann Minkowski each contributed significant work in building on Maxwell’s equations and teasing out the fact that you could treat time (t) as a length by combining it with c and the square root of -1 (also known as i, for imaginary number), specifically by the equation:
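The decomposition is easy to check numerically. This is a small sketch using plain tuples as vectors; the helper names (dot, scale, add) and the sample vector are my own choices for illustration.

```python
import math

# Unit vectors along width, height, and depth.
X, Y, Z = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def scale(k, v):
    return tuple(k * a for a in v)

def add(*vectors):
    return tuple(sum(parts) for parts in zip(*vectors))

v = (2.0, 3.0, 6.0)

# Right angles mean the unit vectors do not overlap: their dot products are zero.
print(dot(X, Y), dot(Y, Z), dot(X, Z))

# Project v onto each unit vector, then add the pieces back together.
components = [scale(dot(v, e), e) for e in (X, Y, Z)]
rebuilt = add(*components)
print(rebuilt == v)  # True

# The diagonal of the 2 x 3 x 6 prism.
length = math.sqrt(dot(v, v))
print(length)  # 7.0
```

Because the unit vectors are at right angles, each projection captures one component and nothing else, so the three pieces sum back to the original vector.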
l = ict
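One consequence of treating time as the imaginary length l = ict is that squaring it produces a negative term, so the squared four-dimensional "length" becomes x² + y² + z² − c²t². Here is a hedged sketch of that arithmetic using Python's built-in complex numbers; the function name interval_squared is mine, and sign conventions differ between textbooks.

```python
C = 299_792_458.0  # speed of light in meters per second

def interval_squared(x, y, z, t, c=C):
    """Squared 4D length, with time converted to the imaginary length l = i*c*t."""
    l = 1j * c * t
    return (x * x + y * y + z * z + l * l).real

# A flash of light travels x = c*t in time t, so its interval is zero.
t = 1.0
print(interval_squared(C * t, 0.0, 0.0, t))  # 0.0

# Something that sits still for one second has a negative squared interval.
print(interval_squared(0.0, 0.0, 0.0, t) < 0)  # True
```

The imaginary unit is what lets time sit alongside the three ordinary lengths in a single Pythagorean-style sum.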
In day-to-day life, this dimension doesn’t affect our perceptions much, primarily because human perception is good only down to about 1/30 of a second. In that time, light moves roughly 10,000,000 meters (10,000 km, or about a quarter of the way around the earth). However, the T-CUP camera, created at the Université du Québec in Canada, can take pictures at speeds of ten trillion frames per second. At that rate, light moves about 0.03 mm per frame, which can be interpreted as oscillations. A point on the surface of a spinning object still spans a distance, which is one of the interpretations of an imaginary number, according to Euler way back in the eighteenth century. Thus, we begin to get a physical definition of a dimension as a thing that has a length in some orthogonal way. Relativistic mechanics, in which velocities get close to the speed of light, follows sets of transformations (the Lorentz transformations) that are extensions of Euclidean affine transformations such as translations, rotations, skews, and so forth, as was shown in the work of a certain German patent clerk named Albert Einstein, who was one of Minkowski’s students.
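The distances here are simple to work out. This sketch uses the defined value of the speed of light; the figure of roughly 40,075 km for the earth's circumference is my own added assumption, as is the helper name light_distance.

```python
C = 299_792_458            # speed of light, meters per second
EARTH_CIRCUMFERENCE_M = 40_075_000  # approximate equatorial value (assumption)

def light_distance(seconds):
    """Distance in meters that light covers in the given time."""
    return C * seconds

# One "frame" of human perception, about 1/30 of a second.
per_blink = light_distance(1 / 30)
print(round(per_blink / 1000))            # kilometers, close to 10,000
print(per_blink / EARTH_CIRCUMFERENCE_M)  # about a quarter of the way around

# One frame at ten trillion (1e13) frames per second.
per_frame = light_distance(1 / 1e13)
print(per_frame * 1000)                   # millimeters, about 0.03
```

At ten trillion frames per second, light covers only a few hundredths of a millimeter between frames.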
The remaining six (or seven) dimensions that physicists talk about are similar in that they can be described as lengths. Still, the scale of these lengths is so small that they generally do not manifest until you get into very high-energy physics (and are beyond the scope of this article). However, some key concepts come out of physics (and mathematics) that are very useful to programmers.
Strawberry Field Forever
Suppose you have a space defined by multiple dimensions. In that case, a mathematician may talk about a function that, for a given line, surface, or volume, maps each coordinate in that particular space to a given value. This mapping function is called a field, and it is a very useful concept indeed. A field is not a dimension – it’s a map. It says that for every point in the manifold (aka a line, surface, space, etc.) there is something associated with that point. Perhaps the most intuitive example of such a field is a heat map. Now note that temperature is not a dimension – it’s actually a fundamental property in its own right, indicating the amount of kinetic activity within a given manifold that’s being converted into heat. A weather map that shows the temperatures at given places on the map (such as the cities around the Puget Sound) is a perfect example of such a heat map in two dimensions, showing the temperature of the air. This is called a scalar field because each measured value is a single number with no associated direction or length.
On the other hand, you could measure the direction and speed of the wind at various points on the map, creating a vector field in either two or three dimensions, depending upon whether you’re measuring updrafts and downdrafts as well. While these vectors have dimensions of their own, the dimensions of the field are distinct from the dimensions of the manifold.
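In code, a field is just a lookup from points on the manifold to values. The sketch below uses a tiny 2 by 3 grid of map cells with made-up temperature and wind readings; every number and name is hypothetical.

```python
import math

# The manifold: a 2 x 3 grid of map cells, addressed by (row, column).
cells = [(r, c) for r in range(2) for c in range(3)]

# Scalar field: one number per cell (made-up temperatures in Celsius).
temperature = dict(zip(cells, [11.0, 12.5, 12.0, 10.5, 13.0, 14.0]))

# Vector field: an (east, north) wind vector per cell (made-up, meters per second).
wind = dict(zip(cells, [(3.0, 4.0), (0.0, 2.0), (1.0, 1.0),
                        (-2.0, 0.0), (0.5, -0.5), (6.0, 8.0)]))

def wind_speed(cell):
    """Magnitude of the wind vector at a cell."""
    return math.hypot(*wind[cell])

# Both fields live on the same manifold, but hold different kinds of values.
print(temperature[(0, 0)])  # 11.0
print(wind[(0, 0)])         # (3.0, 4.0)
print(wind_speed((0, 0)))   # 5.0
```

The grid has two dimensions no matter how many fields you hang on it, and the wind vectors bring two components of their own without changing that.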
Other examples of scalar fields might include income level aggregates by postal code, consumption of alcohol by county, or the number of votes a given candidate received by legislative district. The same manifold, thus, may have many different associated fields, and the use of the manifold becomes a powerful way of attempting to find correlations between different fields. Put another way, a scalar field (or any field for that matter) can be considered a feature for machine learning, with feature sets being clusters of related features, typically tied into a given map.
Note here that while I’m using the term map in the mathematical sense, as the projection of a manifold to a corresponding field, maps in the geospatial sense are also maps in the mathematical sense. Indeed, while most modern cartographic systems still make use of a quasi-spherical latitude/longitude/altitude system (which is still three orthogonal lengths, btw), the typical approach is to represent the world as a tessellation, with each tile in that tessellation acting as the domain for a map whose range describes everything from altitude to heat maps to political results. One advantage of such an approach is that it also turns a map into an array that can be processed very efficiently.
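To show how a tessellation turns a map into an array, here is a deliberately naive sketch that cuts the globe into equal-angle tiles of 10 degrees. Real cartographic tiling schemes are more sophisticated; the function tile_index and the tile size are assumptions made purely for illustration.

```python
TILE_DEG = 10.0
ROWS = int(180 / TILE_DEG)  # 18 bands of latitude
COLS = int(360 / TILE_DEG)  # 36 bands of longitude

def tile_index(lat, lon):
    """Map a latitude/longitude pair to the (row, column) of its tile."""
    row = int((90.0 - lat) // TILE_DEG)
    col = int((lon + 180.0) // TILE_DEG)
    # Clamp the south pole and the 180-degree meridian into the last tile.
    return min(row, ROWS - 1), min(col, COLS - 1)

# A field over the tessellation is then an ordinary 2D array.
temperature = [[None] * COLS for _ in range(ROWS)]

# Seattle sits at roughly 47.6 N, 122.3 W; the reading is made up.
row, col = tile_index(47.6, -122.3)
temperature[row][col] = 12.5

print(row, col)  # 4 5
```

Once every tile has an index, any field over the map can be stored and processed like any other array.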
Phase Space Fandango
Why is this important to a programmer or data scientist? Beyond making it clear that there is a distinction between dimension and feature, the notion of fields can help simplify both organization of information and approaches to finding correlations. Most of data science is built around working with aggregates – this is what statistics is, when it comes right down to it. It is generally not possible to sample everyone or everything within a given cell of a map, so in most cases, you will be dealing with representative samples, which in turn can be thought of as a mean entity with a distribution cloud surrounding it to indicate how accurate the sample is against other members in the appropriate cell.
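A representative sample for one cell can be boiled down to a mean plus a measure of the cloud around it. This sketch uses Python's statistics module on made-up readings; the function name summarize is hypothetical.

```python
import math
import statistics

def summarize(samples):
    """Return the mean, the sample standard deviation, and the standard error."""
    mean = statistics.mean(samples)
    spread = statistics.stdev(samples)
    std_error = spread / math.sqrt(len(samples))
    return mean, spread, std_error

# Made-up readings sampled from a single map cell.
cell_samples = [10.0, 12.0, 14.0]
mean, spread, std_error = summarize(cell_samples)

print(mean)       # the "mean entity" for the cell
print(spread)     # how wide the distribution cloud is
print(std_error)  # how far the sample mean may sit from the true mean
```

The standard error shrinks as you collect more samples in the cell, which is the usual trade-off between sampling cost and accuracy.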
Correlations come from these samples and allow you to work with phase spaces, in which two or more variables are compared with one another across a common “ground”. Clustering is done by finding commonalities across different fields (though these are treated metaphorically as dimensions in the literature), and dimensional reduction comes down to eliminating features common to all fields being produced (such as a position on a map or manifold).
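Here is a minimal sketch of comparing two fields across a common ground. The fields, their values, and the cell names are all made up; the correlation is the ordinary Pearson coefficient, written out by hand so the example needs nothing beyond the standard library.

```python
import math

# Two hypothetical fields over the same four map cells.
income = {"A": 40.0, "B": 55.0, "C": 70.0, "D": 85.0}   # thousands per year
turnout = {"A": 0.50, "B": 0.58, "C": 0.61, "D": 0.72}  # fraction of voters

def phase_points(field_a, field_b):
    """Pair up the values of two fields at every cell they share."""
    shared = sorted(set(field_a) & set(field_b))
    return [(field_a[k], field_b[k]) for k in shared]

def pearson(points):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    var_y = sum((y - mean_y) ** 2 for _, y in points)
    return cov / math.sqrt(var_x * var_y)

points = phase_points(income, turnout)
r = pearson(points)
print(points)
print(r)  # strongly positive for this made-up data
```

Notice that the cell names drop out of the phase space entirely. The manifold is only the key that lines the two fields up.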
Big math words aside, what this means is fairly simple. You can think of any kind of geospatial map as the home of multiple fields, and you can use that map as one important key to discovering both how things evolve over time (as fields) and how they correlate with other property fields. This can be a critical part of any knowledge engineering process and can cut down significantly on the effort of developing useful models.