Core Dump


A project by Fiona Condon, Victoria Le, and Sylvia Tomayko-Peters

The Idea:

In creating "Core Dump," we were interested in applying the concept of composting to digital material: turning waste into potentially reusable matter. We conceived of a website that would take documents the user considered "trash" or "waste" and process them, along with the documents submitted by other users, into new combinations of words and phrases; the result is language compost at the level of both the sentence and the word. We began with the idea of mashing up language using a Markov model, a statistical model that analyzes input text and then spits back out newly combined sentences based on the probability of particular grammatical structures and vocabulary in the original. This process seemed like a fruitful site for exploring the intersection between language and digital media. The idea then morphed into an exchange bank, whereby a user would upload text they considered trash and receive in return text that someone else had uploaded by the same logic, which would hopefully be interesting or inspiring to them instead: an exploration of the idea that "one man's trash is another man's treasure."

"Core Dump" evolved from these two ideas into a combination of mashed-up documents that would hopefully be amusing, and ideally inspiring, to those interested in language. The name "Core Dump" itself derived from a desire to find language common to both the computing and composting worlds. A "dump" can mean "a site for depositing garbage" (which, along with the word "core," invokes the idea of an apple core, post-user waste), while in computing a "core dump" or "memory dump" is "the recorded state of the working memory of a computer program at a specific time, generally when the program has terminated abnormally or crashed."

In addition to the website built as a composting platform, we put up posters around the Brown campus to increase the number of users who would deposit their digital waste with us. These posters featured the phrases "Compost?" or "Want to Compost on Campus?" without clarifying the digital nature of our composting project. This ambiguity was meant to play off the hyper-presence of contemporary "green" campaigns and poke fun at the idea that the trash on your computer is just as vital to compost for the health of our planet as your household trash. In fact, the ambiguity of our posters proved effective. A friend of one of the project members, who is part of Brown's own composting initiative, SCRAP, saw the poster and was taken aback that someone else was, unbeknownst to him, also attempting to mobilize the campus to compost. When he scanned the QR code on the poster he soon realized his mistake, but he did in fact proceed to compost a document to help out "Core Dump."


The Coding Process:

The fundamental technical principle behind the project is that an existing body of text can be manipulated by a simple, stateless algorithm that outputs text "statistically likely" to conform to the rules of English sentence composition. That algorithm, which we did not invent, is called a Markov chain. It traverses a body of text, associating each string of a certain length with a list of all the characters that come after that string. When it is time to choose a new character to add to the output, the algorithm chooses at random from the characters known to follow the last short string. The longer the string, the more the output resembles real language, but the less likely it is to branch into another part of the text. We found a length of four characters to be appropriate for our purposes.
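The character-level process described above can be sketched as follows. This is a minimal illustration in JavaScript, not the project's actual PHP code; the function names are our own.

```javascript
// Build a table mapping every `order`-character string in the corpus to
// the list of characters observed immediately after it.
function buildChain(text, order = 4) {
  const chain = {};
  for (let i = 0; i + order < text.length; i++) {
    const key = text.slice(i, i + order);
    (chain[key] = chain[key] || []).push(text[i + order]);
  }
  return chain;
}

// Generate output by repeatedly picking a random successor of the last
// `order` characters emitted so far.
function generate(chain, seed, length, order = 4) {
  let out = seed;
  while (out.length < length) {
    const successors = chain[out.slice(-order)];
    if (!successors) break; // dead end: this key never recurs in the corpus
    out += successors[Math.floor(Math.random() * successors.length)];
  }
  return out;
}

// Example: output is random, but every 5-character window it produces
// also appears somewhere in the source text.
const chain = buildChain("the cat sat on the mat. the cat ran to the man.");
console.log(generate(chain, "the ", 40));
```

A longer key (higher `order`) makes each window look more like the source, which is why the output reads as near-English while still splicing documents together at shared four-character seams.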

The initial coding process began with appropriation, in the form of borrowing chunks of open-source code to implement the Markov algorithm, which is a kind of common knowledge and intellectual interest among certain communities on the web. The next step was to get that algorithm to pull from a database, rather than a simple web-based text input, and to define an interface for uploading files to the database for processing. We also thought it would be interesting to expose the workings of the algorithm and give users a way to identify their own text, by associating each string with a user-selected color. This was more challenging: we had to fundamentally rework the algorithm to associate each character with the color stored in the database. The backend that handles the text generation and database connections was implemented in PHP, while the code that paces the "typing" effect in the browser was written in JavaScript. Because it was difficult to port the character-to-color association from the PHP code to the JavaScript code, we instead wrapped the PHP output in HTML tags that define the color of each individual character. That means that every time a character appears to be "typed" on the screen, the JavaScript is really adding a whole HTML font tag with a color attribute to the HTML document.
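The per-character tagging described above can be sketched like this. This is an illustrative reconstruction, not the project's source; the `<font>` tag mirrors what the paragraph describes, and the helper names are hypothetical.

```javascript
// Wrap a single character in a font tag carrying the color chosen by
// the user who contributed it (the backend does this per character).
function colorize(ch, color) {
  return '<font color="' + color + '">' + ch + '</font>';
}

// Browser-side "typing" loop: append one pre-colored tag at a time so
// each character appears to be typed in its contributor's color.
function typeOut(taggedChars, element, delayMs) {
  let i = 0;
  const timer = setInterval(() => {
    if (i >= taggedChars.length) return clearInterval(timer);
    element.innerHTML += taggedChars[i++];
  }, delayMs);
}
```

A `<span style="color:…">` would be the modern equivalent of the `<font>` tag, but the effect is the same: the color travels with each character, so no character-to-color mapping needs to be shared between the PHP and JavaScript layers.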

It was important to us that the website be truly viable as a tool and resource for potential users. It was therefore essential that the generated word strings not be totally indecipherable, but adaptable for use in other texts. The Markov chain made this possible. While the algorithm doesn't create whole sentences that make grammatical or logical sense, individual groupings of words may hold intriguing possibilities (the algorithm is especially good at creating unusual portmanteaus out of two or more words). Even with the algorithm running, the user can highlight and copy sections of text to be pasted elsewhere and reused for whatever purpose.

So, the website functions based on two kinds of interaction with the user: the first interaction takes place when the user chooses to enter a document into the data pool; the second takes place between the user and the stream of text. The product of this project might be considered threefold: the "advertising" campaign, the website itself, and the hypothetical products created by the user with the generated text.
