Peter Jansen | PhD Candidate | Cognitive Science Laboratory


stuff about me
my name is peter, and i'm a slightly strange, often blue-haired graduate student in the cognitive science laboratory at mcmaster university. my research interests are broadly in developmental knowledge representation, which studies how systems of knowledge representation (like us) develop, what sorts of things we can represent in our brains (like concepts and language), how we represent them, and how we can get computational systems (like neural networks) to do something similar.

i have interests in specific areas of computer science, physics, and math, all of which are essential to understanding what's in a representation. my undergraduate degree was a bachelor of independent studies at the university of waterloo, specializing in physics (astro/optical) and cognitive artificial intelligence, with an official cognitive science option. my supervisors in the cogscilab are Scott Watter and Karin Humphreys, who are joined by Alex Sevigny and Lee Brooks in trying to keep me out of as much trouble as i inflict upon them (i.e., they are on my supervisory committee).

current research
my current research has three major focuses:
Developing self-organizing neural network models of language acquisition
The Chimaera architecture appears to have a particular synergy with knowledge representation and language modelling, with phenomena as diverse as taxonomic grouping, attenuable semantic priming, complex grammar acquisition, and ambiguity resolution occurring naturally as a consequence of the model’s low-level dynamics. As a recent project in linguistic grammar acquisition, Jansen, Watter, and Humphreys (in preparation) developed a multilayer Chimaera network simulation that was able to self-organize a sample English grammar with 8 high-level part-of-speech tags (N, V, DET, ADJ, PREP, CONJ, ADV, AUX) through exposure to 20 sample grammatical sentence parses containing these parts of speech. This initial work on grammar acquisition extends very naturally to a much richer grammar, and I would like to develop large-scale simulations that derive the complex grammatical rules present in the natural language parses of the Brown Corpus, which contains approximately 80 part-of-speech tags. I have already developed and piloted a distributed computing toolbox for the Chimaera architecture, allowing large-scale simulations to be spread across hundreds of processors simultaneously, and enabling models of word and grammar acquisition with diverse, rich representations to be derived from a large corpus in significantly less time than on a single workstation.

From a higher level, I would like to investigate how multilayer Chimaera networks can be used as building blocks for larger multi-module information processing architectures, much in the same way that Ping Li et al.'s (2007) DevLex-II (and other models) connect separate representation or processing systems together to acquire or process different types of knowledge. In this case, individual multilayer Chimaera networks would acquire different types of representations – say, phonology, and sensorimotor semantic representations of concepts – whose activation maps can then be joined together to form the input for a higher-layer 'lexicon' convergence network. From here, this project would develop a method to infer part-of-speech from these high-dimensional semantic representations, likely through a process of dimensionality reduction, which would then serve as input to the existing grammar acquisition model. This combined phonology-lemma-grammar model would recast existing separate self-organizing models for acquiring phonological and semantic information as Chimaeras, then amalgamate them with the existing grammar-acquisition Chimaera. This would provide an exciting working sketch of a complete self-organizing system capable of acquiring both a diverse knowledge of individual words and a knowledge of how they may be grammatically combined. Participant learning data would then be compared to this amalgamated model, incorporating both the empirical data currently modelled by Li et al. (2007) and other psycholinguistic and categorization phenomena that emerge naturally from the Chimaera's low-level network dynamics.

The Chimaera self-organizing neural network architecture
Over the past two years of my PhD studies I have developed and progressively refined the network dynamics of a neural network architecture that blends together the concepts of a self-organizing map, an activation map, and intralayer Hebbian learning over time. When we develop emergent artificial neural networks, we are generally interested in two specific properties: how well the network can store or represent information, and how well it can process or compute with that information. Common architectures generally capture one side of this distinction particularly well -- for instance, a Kohonen Self-organizing Map (Kohonen, 1982) is particularly good at representation, and does not require a teacher, but (without extensions) is virtually unable to carry out processing with that information. Similarly, a network based on the backpropagation architecture (e.g. Rumelhart and McClelland, 1986; Elman, 1990) is particularly good at learning some types of processes, but it requires a teacher, and is not very good at representing information.
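To make the representational half of this distinction concrete, here is a minimal toy sketch of the Kohonen update rule (my own illustrative implementation, with made-up grid size and learning parameters, not code from any of the models described here): each input is matched to its best-matching unit, and that unit and its map neighbours are pulled toward the input.

```python
import numpy as np

def train_som(data, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a toy Kohonen self-organizing map on row-vector data."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))   # random initial codebook
    ys, xs = np.mgrid[0:h, 0:w]                   # node coordinates on the map
    steps, t = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = t / steps
            lr = lr0 * (1 - frac)                  # decaying learning rate
            sigma = sigma0 * (1 - frac) + 0.5      # shrinking neighbourhood
            # best-matching unit: the node whose weight vector is closest
            d = np.linalg.norm(weights - x, axis=2)
            by, bx = np.unravel_index(np.argmin(d), d.shape)
            # pull the BMU and its neighbours toward the input (Gaussian kernel)
            g = np.exp(-((ys - by) ** 2 + (xs - bx) ** 2) / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
            t += 1
    return weights

colours = np.random.default_rng(1).random((200, 3))  # 200 random RGB colours
som = train_som(colours)
print(som.shape)  # (10, 10, 3)
```

Trained on random colours like this, nearby nodes come to represent similar colours: representation without a teacher, but with no machinery for computing with what is stored.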

In terms of the representation/processing distinction, the Chimaera stores representations explicitly in the self-organizing map, while associative processes are developed through Hebbian learning over time. Where activation in a backpropagation network tends to be interpreted as a transient representation, activation in the Chimaera at a given time represents the network’s current processing state – itself a representation used by superordinate layers to gain computational capabilities, but with a very different character than explicit declarative representations of sequence elements. As such, where (1) representation is explicit in a self-organizing map, and (2) processes are explicit and representation is transient activation in a backpropagation network, (3) both representations and processes are explicit (that is, stored in the network’s weights) in a Chimaera, where the network’s current processing state is transient activation, itself used as input in multilayer Chimaera networks. Using this processing state as input to superordinate layers allows complex ambiguity resolution to take place in sequence learning, and it is my hope that this style of computation may eventually allow neural systems to overcome their issues with reflectiveness.
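The Chimaera's actual dynamics aren't reproduced here, but the flavour of intralayer Hebbian learning over time can be caricatured in a few lines (a hypothetical toy with invented unit indices and learning rate, not the real model): when units win for successive sequence elements, the connection from the previous winner to the current one is strengthened, so transition structure accumulates in the weights alongside the explicit representations.

```python
import numpy as np

# A caricature, not the Chimaera itself: pretend each sequence element
# activates one winning unit on the map, and an intralayer Hebbian matrix
# learns the transitions between successive winners over time.

n_units = 4
assoc = np.zeros((n_units, n_units))   # assoc[i, j]: strength of i -> j
eta = 0.1                              # hypothetical Hebbian learning rate

sequence = [0, 1, 2, 0, 1, 3]          # winning units over six time steps
for prev, cur in zip(sequence, sequence[1:]):
    # bounded Hebbian update: strengthen the transition just observed
    assoc[prev, cur] += eta * (1 - assoc[prev, cur])

# the transition 0 -> 1 occurred twice, so it dominates row 0
print(assoc[0])  # row 0 peaks at index 1
```

Both the element representations (here, the unit indices) and the associative process knowledge (the matrix) end up stored in weights, which is the point of the distinction above.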

Empirical studies of the sensory-modality-specific components of concepts
The question of what exactly makes up a "conceptual representation" is a very old one. Past theories have proposed that the representation of a "concept" is either verbal, visual, or some form of "amodal" symbolic representation akin to a cognitive programming language. A relatively recent theory of knowledge representation called "Perceptual Symbol Systems" was proposed by Barsalou in 1999, and suggests that our conceptual representations are modal rather than amodal. Specifically, the Perceptual Symbol Systems theory proposes that our conceptual representations contain modality-specific components (e.g. visual, auditory, tactile, taste, linguistic, etc.), and that these modality-specific components are activated when we make use of our concepts, producing conceptual simulations that both have a perceptual character and actually make use of the same neural systems involved in perceiving a given modality.

My current empirical work looks at developing techniques to measure the modality-specific components of concrete concepts (easily imageable concepts such as tree, desk, sky, etc.), and extending these techniques to determine whether we see the same variance in the modality-specific components of abstract concepts (e.g. scholarship, solution, justice, etc.) as we do in those that are concrete.

publications & presentations

Jansen, P., Watter, S., and Humphreys, K. R. (in preparation). Acquiring a basic knowledge of English grammar with a self-organizing Chimaera network.

Jansen, P., Fiacconi, C., and Gibson, L. (In Press). A computational vector-map model of neonate saccades: Modulating the externality effect through refraction periods. Vision Research.

Jansen, P. (in revision). Using multiple layers as a disambiguation mechanism in a dual-representational Chimaera SOM.

Jansen, P., and Watter, S. (2008). SayWhen: An automated method for high-accuracy speech onset detection. Behavior Research Methods, 40, 744-751. [SayWhen Website]

Jansen, P. (2004). Lexicography in an Interlingual Ontology. Canadian Undergraduate Journal of Cognitive Science, 3, 1-5. [pdf]

Jansen, P. (November, 2009). Chimaera neural networks for self-organizing grammar acquisition. Poster presented at the 50th Annual Meeting of the Psychonomic Society. Boston, MA. [pdf]

Jansen, P. (May, 2009). Multilayer Chimaera networks: Self-organizing neural networks for temporal sequence learning. Poster presented at the Shared Hierarchical Academic Research Computing Network (SHARCNET) Research Day 2009. Waterloo, ON. [pdf]

Jansen, P. (June, 2008). The Tricorder project: See what can't be seen. Poster presented at the 2008 McMaster Innovation Showcase. Hamilton, ON.

Jansen, P. (June, 2008). Chimaera Networks: Temporal self-organizing artificial neural networks for sequence learning. Poster presented at the 18th Annual Meeting of the Canadian Society for Brain, Behavior, and Cognitive Science (CSBBCS). London, ON. [pdf]

Jansen, P., and Watter, S. (June, 2008). SayWhen: An automated method for high-accuracy speech onset detection. Poster presented at the 18th Annual Meeting of the Canadian Society for Brain, Behavior, and Cognitive Science (CSBBCS). London, ON. [pdf]

Jansen, P., Watter, S., and Humphreys, K. R. (June, 2010). Chimaera neural networks for self-organizing grammar acquisition. Talk presented at the 20th Annual Meeting of the Canadian Society for Brain, Behavior, and Cognitive Science (CSBBCS). Halifax, NS.
Hebb Student Award (Runner up) for best paper/presentation.

Jansen, P. (Summer, 2009). Introduction to coding experiments in Neuro-BS Presentation. Three-session hands-on workshop delivered at McMaster University. [ Introduction to Presentation Workshop website ]

Jansen, P. (Fall, 2008). Chimaera Networks: Self-organizing neural networks for representations, processes, and temporal sequences. Talk sponsored by the McMaster Psychology Coggie-talks.

Jansen, P. (Summer, 2007). A PIC microcontroller cluster. Talk sponsored by the Shared Hierarchical Academic Research Computing Network (SHARCNET) High Performance Computing Day.

other academic projects
I am generally interested in promoting self-organizing models of representation and computation in neural networks, and finding particularly good or intuitive ways to talk about them, teach them, and visualize them. As part of this, I've rendered a few videos of my network visualizations (which you may have seen while I was presenting a Chimaera poster), including a large-scale self-organizing map with over a million neurons self-organizing to about 2000 random input colours [Youtube] (this took 128 processors about 6 hours to compute on SHARCNET). In this video you can see three distinct phases: an initial oscillatory phase where representations rapidly move around the nodes, a stabilization phase (about 50% of the way through) where the representations have found their general topology, and a phase of rapid differentiation (about 80% of the way through) where the representations quickly differentiate from general categories to specific instances (this is my favorite part ;) ). A four-dimensional association map of a Chimaera network self-organizing [Youtube] to a small subset of English grammar is also available, where one can watch the network acquire both representations of the individual sequence elements (the colours) and the associations and transitions between elements (the association data within each cell).

Often I help out with projects in and around the lab where software development is required. Some of the more interesting examples include writing the SpeakWrite Viewer for Debra Pollock's Master's thesis on handwriting errors, which allows one to view and play back electronically recorded handwriting data (much cooler than it sounds), and modifying StepMania to record data for April Lee's Dance Dance Revolution-inspired Honours thesis.

There are also a number of past academic projects that I haven't yet written up, a few of which include 'Star-Cross: a trans-spectral edge extraction algorithm' and 'The Dynamic Self-Organizing Map: growing hierarchical relationships in a connectionist system' (my undergraduate thesis). (A side-effect of this research is some very beautiful pictures of network topologies.)

art and music
i sometimes dabble in making neat things. here are some of the artsy stuffs i've made with modelers (vue d'esprit) or genetic art programs (like MAGE), and a couple of photography attempts [Flickr]. and if you like electronic music, maybe you'll enjoy this.

other current projects

3D printing and the RepRap project
3D printing rapidly creates actual three-dimensional objects, just as two-dimensional printers create pictures and text on sheets of paper. The RepRap project has developed and progressively refined low-cost open-source 3D printers that create objects out of plastic, with the ultimate goal of designing a 3D printer that could itself print all the parts required to construct another printer (so that you can print one for your friend, and they can print one for another friend, and so forth). My dad and I have constructed our own 3-axis 3D printer, and we use it to print out stuff. In addition, we're active members of the RepRap community, and I wrote the open-source gcode visualization tool [Youtube] for viewing RepRap or MakerBot toolpaths exported from Skeinforge, a 3D model-to-toolpath slicer.
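As a rough illustration of the first job such a viewer has to do (a hypothetical minimal reader written for this page, not the actual visualization tool), one can walk the G-code and extract each G0/G1 move's position and whether it extrudes:

```python
import re

def parse_moves(gcode_text):
    """Extract (x, y, z, extruding) tuples from G0/G1 moves.

    A toy reader: it tracks only absolute X/Y/Z words and notes
    whether each move carries an E (extrusion) word.
    """
    x = y = z = 0.0
    moves = []
    for raw in gcode_text.splitlines():
        line = raw.split(";", 1)[0].strip()   # strip comments
        if not line:
            continue
        if line.split()[0] not in ("G0", "G1"):
            continue
        words = dict(re.findall(r"([XYZE])(-?\d+\.?\d*)", line))
        x = float(words.get("X", x))
        y = float(words.get("Y", y))
        z = float(words.get("Z", z))
        moves.append((x, y, z, "E" in words))
    return moves

sample = """\
G1 Z0.3 F3000      ; move to first layer height
G1 X10 Y10 F1500   ; travel move
G1 X20 Y10 E1.2    ; extruding move
"""
print(parse_moves(sample))
# [(0.0, 0.0, 0.3, False), (10.0, 10.0, 0.3, False), (20.0, 10.0, 0.3, True)]
```

From a list like this, drawing travel moves and extruding moves in different colours gets you most of the way to a toolpath viewer.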

Open Selective Laser Sintering (SLS) 3D printer project
Similar to the above, I have been actively designing and constructing prototypes for an inexpensive 3D printer based on the idea of Selective Laser Sintering (SLS) instead of the Fused Deposition Modelling (FDM) used by the RepRap project. Moving to SLS has the advantage of allowing the creation of complex arbitrary geometries, with potentially much higher resolution, while potentially lowering overall cost and eliminating many of the issues surrounding the construction of durable and robust FDM extruders. This project is being developed openly: many of the design files are available on Thingiverse, while the development process is detailed on the RepRap Builders Blog. The project has developed a really interesting and unique design philosophy for creating inexpensive objects out of single materials while designing for self-replication. It has received a bunch of notable attention, particularly for the design of an entirely laser-cuttable linear CNC axis for around $7 in parts, a dual z-axis build chamber system for around $20 in parts, and the design of a reciprocating laser cutter [slashdot] that allows a fairly modest laser to do useful laser cutting work.

the Tricorder project
I recently designed and constructed two models of handheld multi-sensor devices capable of measuring and visualizing various atmospheric, electromagnetic, and spatial phenomena. For more information, please contact me.

past projects
PIC Microcontroller Cluster
I have always been enchanted with computational problems that can be spread across more than one processor. As the number of processors becomes large, implementations generally take on a beauty and elegance necessitated by the structure of the problem, the topology of the computation, and the flow of information within the architecture that a given problem defines. Because they fascinate me, I've always wanted to build one myself! Learn more about my cluster of digital signal processing microcontrollers (dsPICs). This project was featured on Hack-a-day.

FPGA Connection Machine
As a starter project to learn Verilog, I developed a for-fun single-chip implementation of the Connection Machine, my favorite supercomputer. The Connection Machine contained 65,536 single-bit processors connected in a 12-dimensional hypercube topology, fed SIMD instructions by an onboard microcontroller that was externally connected to a LISP machine. It had a beautiful architecture, and I'd love to see one someday. My implementation fits about 256 single-bit processors into a Xilinx Spartan 3E starter kit (although I'm sure many more could fit if the architecture were optimized). The array of single-bit processors is controlled by a PicoBlaze soft-core microprocessor, modified to include special instructions for the processing array, which runs a program written in PicoBlaze assembly to interface with an external host over RS232.

Ludumdare 48-hour Game Competition
The Ludumdare 48-hour game competition is a legendary competition that occurs once or twice a year, in which a complete game, including all programming, graphics, and sound, must be created over one 48-hour period by one person. The theme is released right at the beginning of the competition, and the results are often both amazing and hilarious. My entries tend to be abstract, and to experiment with things I haven't tried before. Bulrushes, for LD13 (Roads), is a pretty game in the artistic style of Okami (a PS2 title), where everything is drawn with a brush. The player creates paths of light to guide little fireflies by illuminating bulrushes, and when fireflies come close together, they excitedly mate and burst away, leaving a new firefly in their wake. Bulrushes received third place in the graphics category, and fourth or fifth in innovation. Paramecium, for LD10 (Chain Reaction), is a monocellular love story in which the protagonist (you) must collect food vacuoles in order to replicate. It was an experiment in generating abstract procedural graphics, and features artsy graphics, intuitive mouse control, and a looping ambient soundtrack. [ludumdare entries blog]

Karawachi and Vibeforce
While an undergraduate I was one of the programmers for an indie video game studio, Sherman3D, where I worked on Karawachi (an anime-based 3D space shooter) and the Vibeforce MMORPG project (another anime-based role playing game). While Karawachi was programmed entirely in-house, we had some massive partnerships for Vibeforce, including the NetImmerse engine by NDL (used in AAA titles such as Morrowind) and the Butterfly.NET supercomputing grid. Unfortunately our team was small, and due to politics Vibeforce never emerged from the prototype stage, though it was shown at E3 before the project ended. A year after Vibeforce disbanded I partially rewrote the Karawachi engine from scratch, porting it to DirectX 8 and adding a more 'gravity-well black-hole action!' theme to the gameplay [screenshots]. You can learn more about the history of Sherman3D on Wikipedia.

While an undergraduate I also worked on creating an early prototype handheld multisensory device for various atmospheric and physical measurements, similar in function (and looks) to a "tricorder". A fairly cool-looking prototype was developed [picture] with a few sensory capabilities, with a much more functional prototype in the works. The original design used a BasicX microcontroller connected to a custom-programmed (and somewhat modified) Game Boy Advance SP, with the BasicX serving as a low-level sensor interface and the Game Boy providing computational power and a graphical interface. A later prototype, never completed, was to replace the Game Boy Advance SP with an Imsys SNAP module (a very small Linux system) coupled with a separate LCD display equivalent to the one used by the Game Boy Advance SP. A partially completed prototype using the SNAP board and a Game Boy Advance SP to display the data is pictured here. Please note that this describes a very early project, and not the current Tricorder project.

Large Prime Numbers
Number theory and topology are almost to mathematics what the philosophy of science is to science. More an avid personal academic interest than a course of research, number theory strikes me as a beautiful and artistic study that seems to give one glimpses of how the universe (and information) work on a fundamental level. I especially enjoy "reinventing the wheel" and deriving conjectures from the raw data itself -- it gives a taste of how Gauss and Euclid must have felt when they first made their discoveries. While I enjoy many areas of number theory equally, I often try to further my knowledge by understanding and pushing the computational and representational issues involved in verifying the primality of large numbers. I recently independently derived a (somewhat more efficient) algorithm for computing the residue of large numbers modulo 2^p - 1, which I am rather proud of (although after some quick searching I found the algorithm has been known for some time). At least it means I know what I'm doing!
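For the curious, the long-known version of this trick rests on the fact that 2^p ≡ 1 (mod 2^p - 1), so a big number can be reduced by repeatedly summing its p-bit chunks rather than dividing. A sketch of that standard folding approach (not necessarily the exact algorithm I derived):

```python
def mod_mersenne(n, p):
    """Compute n mod (2**p - 1) without division, by folding p-bit chunks.

    Since 2**p ≡ 1 (mod 2**p - 1), the p-bit blocks of n can simply be
    summed; repeat until the result fits in p bits.
    """
    m = (1 << p) - 1
    while n > m:
        n = (n & m) + (n >> p)   # low p bits + remaining high bits
    return 0 if n == m else n    # n == 2**p - 1 means the residue is 0

# sanity check against the ordinary % operator
p = 13
for n in (0, 1, (1 << p) - 1, 123456789, 1 << 64):
    assert mod_mersenne(n, p) == n % ((1 << p) - 1)
```

This is the same reduction used inside Lucas-Lehmer testers for Mersenne numbers, where avoiding a general division on enormous operands matters a great deal.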

FPGA vision
A delightful and ongoing side-project to develop a robotic platform for embodied developmental knowledge representation research, this project is a first step toward a computationally efficient and adaptable visual system in a field programmable gate array connected to tiny cellphone cameras. Once it's complete, I hope to place it within a Manoi humanoid robot or similar (when they're available!). Learn more!

The Algorithm, aka "The Only Algorithm You'll Ever Need*"
A silly and fun project that started off wondering if one could make headway on the P = NP? problem by brute-force checking all possible algorithms (up to some number of lines of code) for some extremely simple NP problem. The Algorithm -- the idea of generating and running all possible pieces of code given a finite set of instructions, variables, and constants -- is a really neat problem to think about. It also, of course, happens to probably be *the* most combinatorially explosive algorithm you could ever run, without repeating yourself -- you are, after all, generating all possible pieces of code that could run in a given computational space. If this makes you laugh or just intrigues you, here is some sample code, as well as some sample output for a trivial implementation of the problem that will likely finish executing in your lifetime. While this implements the "instructional" formulation of the algorithm -- that is, it generates all possible code to run on a given system -- it's very interesting to consider the "representational" formulation -- starting with an example input and output state, and finding some path of (possibly unknown, unusual, or minimal) operations that constitutes a fully working algorithm for that problem.
* as time approaches infinity.
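To make the "representational" formulation concrete, here is a hypothetical miniature (a two-register machine and four-instruction set invented purely for this sketch): enumerate every program up to a given length, shortest first, and return the first one that maps the input state to the desired output state.

```python
from itertools import product

INSTRUCTIONS = ("inc_a", "dec_a", "add_b_to_a", "copy_a_to_b")

def run(program, a, b):
    """Execute a program (a tuple of instruction names) on registers a, b."""
    for op in program:
        if op == "inc_a":
            a += 1
        elif op == "dec_a":
            a -= 1
        elif op == "add_b_to_a":
            a += b
        elif op == "copy_a_to_b":
            b = a
    return a, b

def find_program(start, target, max_len=4):
    """Brute-force every program up to max_len; return the first that works."""
    for length in range(max_len + 1):
        for program in product(INSTRUCTIONS, repeat=length):
            if run(program, *start) == target:
                return program
    return None

# find a program that doubles register a (leaving a copy in b): (3, 0) -> (6, 6)
print(find_program((3, 0), (6, 6)))
# ('copy_a_to_b', 'add_b_to_a', 'copy_a_to_b')
```

Even this toy shows the combinatorial explosion: with 4 instructions there are 4^n programs of length n, so every extra line of code multiplies the search space by four.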

contact me
words : jansenpa [at] mcmaster [d0t] ca
voices : (905) 525-9140 x 22853
people : Psychology Building (PC) 227