
Siggraph Post-mortem, part 1

January 19, 2011

Yesterday was ACM Siggraph's deadline for technical papers. Since mid-November I've been working on two research projects here at the MIT Media Lab's Camera Culture group, expecting to write papers about them during this period, known to us as "Siggraph Season". One of the projects took off and we were able to finish a nice paper in time, which we hope will be accepted and well received by the community.

Although I still can't comment on the subject of our research, here are some highlights of the process and the eventual submission of a paper to Siggraph:

  • Good ideas need a lot of time to mature. The best projects, those that eventually get submitted as a paper, usually stick around for a few years before taking shape; some are even revisited older ideas.
  • Research doesn't come out of nowhere. We need to put a LOT of effort into reading, understanding and addressing previous work, developing the new idea (from idea to concept, from concept to model, from model to working prototypes), and finally producing a paper and supplementary material.
  • It's also teamwork. There is no way one can do it all alone. First, there are simply too many things to be done. Second, people have different preferences. In this type of process, every skill can come in handy, such as knowing how to use a soldering iron, design a new circuit board, create good audio recordings, etc.
  • It seems odd to spend the last two months working 12 to 14 hours every day (including weekends, Christmas, New Year's Eve and any other holiday) and end up with a simple prototype, 6 to 8 pages of dense text, a 4-to-5-minute video and some representative images.
  • At the same time, it's remarkable how much knowledge and information you can gather, process and absorb in two months, even when you're applying your skills in a new field, one you're not very familiar with.

Things that worked for us:

  • Teamwork. We got along quite well and there was no friction inside the team. Everyone had their own rhythm and preferences, and the task assignment happened naturally most of the time.
  • File sharing. We used both version control (svn and git) and Dropbox, for different purposes and at different moments. For most of the basic work, Dropbox is quite handy because of its ease of use and ubiquity.
  • Shared writing and pair reviewing. We shared the writing among three team members and, in the end, all the sections we initially planned for the paper were written. The reviewing at the polishing stage was done in pairs, sometimes in trios. This was very successful, and I personally think the quality of the final text benefited greatly from it.
  • Early writing and external reviews. We started writing the papers (we didn't know at the beginning which ones would make it) very early. Invited reviewers who read the early drafts, and other collaborators who knew about the project, provided valuable comments.

In the next post I’ll talk about what didn’t work quite so well.

Snow Storm

December 30, 2010

Well, some people may have heard about the snowstorm that hit this part of the world a few days ago, so here is what my experience with it was like.

First, I was working during most of the first night of the blizzard, so I couldn't see how much snow was accumulating outside; but at 1:00am it was time to go home, and this is what the street looked like:

snow street, Cambridge MA

This is "snow street", near MIT Media Lab. I had to cross it this way.

But since it had already snowed before (just not this much), salt had already been spread over the most important streets and avenues. Because of that, some adventurous drivers were still driving around. This taxi driver, for instance, wasn't particularly slow.

Crazy taxi

Fast taxi at 77 Massachusetts Avenue (MIT main building).

The cold is not so challenging (if you're well protected, as I was), and the view is quite something for somebody who's never seen snow before (yeah, the freezer doesn't count).

Snow square

My way home during the snow storm

The other news is that we now have a pet at our lab. It’s a mouse that’s very interested in the leftovers we throw away every night after dinner.

Xmas at MIT Media Lab

December 24, 2010

Ho ho ho, it's Siggraph season and everybody is hard at work here at the Camera Culture group. But it is, after all, also Christmas, and we (sort of) celebrate this "holiday" here too. This is a report of our attempt to create a thematic decoration at our lab this afternoon.

First the crew got hands-on to make the Christmas lights and tree work. In this picture we can see Abhijit Bendale hard at work with the help of Vitor Pamplona:

Abhijit and Vitor trying hard to decorate the lab

The result was OK, but not very "inspirational", so we got the help of seasoned Christmas veteran Dr. Ramesh Raskar, head of the Camera Culture group:

Ramesh Raskar fixing the mess

The final result is much more pleasing to the eye now. Notice the detail of the “xmas gift” under the tree:

Merry Christmas!

If you don't believe me when I say our final design is amazing, see the faces of these passers-by when they came across our creation by chance while visiting the MIT Media Lab:

Passers-by amazed by our Christmas tree and lights (and Mr. Snow-singer-man)

Merry Christmas!

Feliz Natal!

Siggraph Project Accident

December 16, 2010

Is light a form of heat? Yeah yeah yeah, of course… Light is made of photons, electromagnetic energy that is directly converted into other forms of energy when absorbed…

But it's also a bit odd when you burn an ant using only sunlight and a magnifying glass (almost everybody did this at some point as a young child). What about burning a light mask without even using a lens?

Well, as part of my ultra-top-secret Siggraph project, I've been playing with a 600-lumen LED that is very bright indeed. But today something unexpected happened: the mask caught fire during an experiment! Blame it on the LED:

Light and burnt mask

Burnt mask still in place, on top of a 600-lumen LED

The burnt mask

Detail of the mask after the burning episode

The 600-lumen LED

The villain: a 600-lumen LED. I was using only one of the two available. Yeah, that's a heatsink.

Siggraph Season

December 1, 2010

It's SIGGRAPH season now, which means I won't be writing a lot for the next month and a half. However, I promise to finish the writing and finally publish the following work-in-progress posts during this period:

  • Trip to California and visit to Google;
  • Comprehensive page (not post) about my thesis research (PNF Story – model and framework for story representation);
  • Funny stories about the MIT loading docks, and the things you can get for free when they start renewing the equipment, a.k.a. throwing away the old stuff (TVs, displays, computers; some new or still working, and a few rarities)

For your amusement, this is a picture of my office's whiteboard today, after a discussion with Prof. Ramesh Raskar about my current SIGGRAPH project:

Crazy algebra: can you find the "snake" matrix?


Assignment #2 – Light Field Camera (part II)

October 21, 2010

Now it's time to show the results with my own photos. I've set up a scene where a bath curtain hides some objects behind its green stripes. The results were very interesting, so I'll start with a challenge. Here's one of the original pictures I took for this experiment:

Can you read who's the author of the book?


Unless you're very familiar with this particular book (for those asking what book: it's near the top-right of the photo, behind the curtain), I guarantee it's almost impossible to guess. So let's see the results from the composed photos:

Focusing at the curtain, the mystery persists...


Tada... Can you read now? Try clicking on the picture to see the larger version.


This last one is focused on the wall on the back.


The most difficult part was aligning the photos after they were taken. I didn't have time to build an automatic gantry, so I used a millimeter ruler and some hacks (double-sided tape, paper as a slider, a stable surface for the camera mount…) to get the correct distance between each photo.

The photo gallery at the end contains all the aligned photos (without any post-processing besides the basic shift pre-matching) and the resulting composites. All original photos were taken with the following setup:

  • Nikon D3000 (APS-C sensor)
  • Aperture f/32
  • Focal length 55mm
  • Quality settings: JPG, fine, medium size

The source code that was used can be downloaded from here.

Assignment #2 – Light Field Camera (part I)

October 20, 2010

Extracted from Stanford’s Light Field and Computational Photography website:

“The light field, first described in Arun Gershun’s classic 1936 paper of the same name, is defined as radiance as a function of position and direction in regions of space free of occluders. In free space, the light field is a 4D function – scalar or vector depending on the exact definition employed. Light fields were introduced into computer graphics in 1996 by Marc Levoy and Pat Hanrahan. Their proposed application was image-based-rendering – computing new views of a scene from pre-existing views without the need for scene geometry… ”

The assignment consists of using a regular camera (or, alternatively, a 3D renderer) to capture a sample of the light field of a scene, and then (re)composing a particular setup (focus distance and point of view) from the captured images. The proposed experiment was to move the camera horizontally in small fixed steps, preferably using a robotic gantry (Lego Mindstorms is an option) to control this movement. I decided to postpone this to part II, starting my experiments with an image set from Stanford University and also a 3D rendered scene.

I used MATLAB to combine the original photos with "shift and add" operations and refocus the images at different planes (orthogonal to the camera sensor). I was also able to create see-through effects with the 3D scene. I'd like to acknowledge Andy, Otkrist and Jessica, also from the Camera Culture Group, for some hints and the basic MATLAB code from which I developed my own version.
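The "shift and add" operation itself is simple enough to sketch. Below is a minimal Python/NumPy version of the idea (the actual work was done in MATLAB; names like `step_px` and `alpha` are my own illustration, not the original code):

```python
import numpy as np

def refocus(images, step_px, alpha):
    """Shift-and-add refocusing: shift each view horizontally in
    proportion to its offset from the center view, then average.
    'step_px' is the per-view parallax in pixels at the reference
    plane; 'alpha' selects the focal plane."""
    acc = np.zeros_like(images[0], dtype=float)
    center = (len(images) - 1) / 2.0
    for k, img in enumerate(images):
        shift = int(round((k - center) * step_px * alpha))
        # np.roll wraps around at the borders; good enough for a sketch
        acc += np.roll(img.astype(float), shift, axis=1)
    return acc / len(images)
```

Points at the chosen focal plane line up across the shifted views and stay sharp after averaging, while everything else gets blurred, which is exactly the refocusing effect in the chess-board images below.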

For the first experiment, I used some sample light field images from the Stanford database. I'm not including the original source images here because they can be obtained from the aforementioned URL. I used the rectified version of the Chess set, and below are the results I obtained:

Chess board refocused at the first row.


Chess board refocused at the middle row.


Chess board refocused at the last row.


For the second experiment I used a virtual scene, rendered with the Unity3D game engine. I used a simple script to shift the virtual camera horizontally and record the rendered images. Below are the composed images showing a see-through effect. The camera is positioned in front of a grass "wall", and there's something behind it. Check the last image to find out what it is. This see-through effect is only possible because the original images were taken with a horizontal shift and the grass blades form a (mostly) vertical pattern (for more complex patterns, such as bushes, I'd have to move the camera along both axes, horizontal and vertical). The original files can be seen and downloaded from the gallery at the end of the post (note that the composed see-through image shows more details of the figure behind the grass than any of the original ones):

Focusing too close to the camera, nothing can be made out in this one.


Focusing at the grass "wall", still not seeing what's hidden behind it


By focusing behind the grass "wall", it is possible to see the bird behind it.


In the next post, I'll show Part II of this assignment, detailing the results of the experiment with my own photos, and post the MATLAB source code. Finally, here's the gallery containing all the images (original and composed) from Part I of Assignment #2.


October 16, 2010

Well, I won’t say much.

This week is the MIT Media Lab Sponsor Week, when we gather to show all the projects to the sponsors of the lab and the press. There are a lot of parallel events happening, and I had the immense luck of meeting one of the most important people in the history of artificial intelligence.



Marvin Minsky, me and Tom Chi


Well, this is Marvin Minsky, me and Tom Chi (from Google) having the most unusual and delightful conversation of my life.

How to run a public-oriented lab

September 24, 2010

The MIT Media Lab is unique in many ways, but one of the most noticeable is that we are constantly dealing with the public. Many companies that sponsor the lab request visits to see the latest advances in the research we do here.

At the same time, casual visitors passing by the lab are allowed to visit the public areas. It's also amusing when we step out for lunch only to find a bunch of businessmen listening to a tour guide explaining what this building is, and so on. At those moments I feel like a tropical blue parrot in a Boston zoo.

For this reason, the spare space at the lab must be organized in a museum-like fashion, with all finished projects on display, accompanied by posters and so on. But the most important part of these visits is always the people (professors, grad students, visiting students, post-docs and so on). Every group that comes to our lab wants to learn something unique, and the best way we can provide that is by giving a live presentation.

It is not practical or fair to have a single person give all these demos every time, given that each lasts around 20 minutes, and that kind of interruption is enough to break anybody's concentration. Add to that the fact that we have an average of four important sponsor visits every week.

The solution to that is actually very simple: have EVERYONE in the lab ready and prepared to give a demo, and use a round-robin style allocation process. Every student here must know how to give a presentation of the lab's research (around 8 to 10 finished projects in each presentation), and the resources are provided, via our internal wiki, by those who actually run each project.

For each project, this consists of:

  • 30-second, 3-minute and 15-minute abstracts;
  • A visual and informative poster-like A4 sheet (besides the published papers);
  • The actual prototype on display (must be working);
  • A ~12-slide overview presentation of the Camera Culture group and its projects;
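The round-robin allocation itself is nothing fancy; in Python it amounts to cycling through the roster (a sketch of the idea only, with made-up names — the real process is coordinated through the wiki):

```python
from itertools import cycle

def make_demo_roster(students):
    """Round-robin demo duty: each call returns the next presenter,
    wrapping around once everyone has had a turn."""
    turn = cycle(students)
    return lambda: next(turn)
```

Each sponsor visit simply takes the next person in the cycle, so interruptions are spread evenly across the lab.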

Unfortunately, I’m not allowed to show pictures of the projects on display and the lab layout, but I hope that my explanation was enough to give an idea on how things work. In a couple of weeks, the most important demo-oriented event will take place here, the Sponsor Week, when around 350 companies will come to see what we have to show. I’ll have a lot to tell about that then.

Assignment #1 – playing with color channels

September 23, 2010

This is the result of the assignment #1 for Dr. Ramesh Raskar’s Computational Photography Course at MIT. The goal is to take pictures of a subject by using different light sources and combining them programmatically. For this assignment, we were encouraged to use our own cameras and the programming platform of our choice.

Equipment and tools

I used a Nikon D3000, with the original 18mm-55mm (F3.5-F5.6) lens set attached, and manually controlling all camera parameters. For the programming part, I chose to use Processing, an open source language for media visualization and manipulation.

Experiment description and setup

All photographs were taken with a fixed camera mount, ISO 200 sensitivity, focal length of 28mm, F4 aperture and 1/30s shutter speed. The room was almost completely dark, so I ignored ambient light for the chosen aperture and shutter speed values (some controversy and more comments on this later).

I tried to reproduce the results obtained by Paul Haeberli, also using two separate light sources around the subject, and taking a picture with each one of them turned on in turn. Taking advantage of the linearity of light, the two photographs can be combined to achieve a result similar to the reference picture below, taken with both lights turned on:

Two lights turned on (reference)

Now, these are the individual photographs for each light source:

Left light source turned on

Right light source turned on


To simulate the reference picture (both light sources turned on), I added the values for each color channel of both the input images. The image below shows the result of this operation:

Simulated full light (sum of source images)

The brightness of the simulated full-light image is a bit higher than the desired result, probably because of some post-exposure dynamic-range processing in the camera. This could also be an effect of ambient light being added twice, but I took control pictures without either of the lights, and all pixels were absolutely black with the camera parameters used for the photos.

To avoid this in future experiments, I’ll take RAW photographs straight from the camera, without any post-processing. However, the achieved result is still quite convincing as natural lighting.
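The combination itself is just a clipped channel-wise sum. Here is a minimal NumPy sketch of the operation (this is not the Processing code used for the interactive version below, and the 8-bit clipping is my assumption about how saturation is handled):

```python
import numpy as np

def add_exposures(img_left, img_right):
    """Channel-wise sum of two photos, each taken with a single
    light source turned on. Summing in uint16 avoids 8-bit overflow
    before clipping back to the valid range."""
    total = img_left.astype(np.uint16) + img_right.astype(np.uint16)
    return np.clip(total, 0, 255).astype(np.uint8)
```

Because light adds linearly, this sum approximates the photo taken with both lights on, up to the in-camera processing mentioned above.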

Another option is to use color masks for each original image, simulating the effect of light sources of different colors. By doing this, very interesting light effects can be achieved, such as those shown by the next images:

Blue channel from left and Red channel from right images

Full Red channel plus some Blue and Green from right image only

Interactive version of the experiment

By using Processing, I chose to make the combination of color channels an interactive application. It wasn't possible to publish the resulting applet here because of security restrictions on my hosting website, but I posted it on my other website:

The full source code for the application can also be downloaded from the above link. The following code snippet is the important part, where I combine the channels based on the position of the mouse relative to the app window:

for (int i = 0; i < out.pixels.length; i++) {

 // original colors from the source pictures
 color c1 = img1.pixels[i];
 color c2 = img2.pixels[i];

 // calculating the amount of each color channel to use from each image
 // (leftIntensity, rightIntensity and normalizedY are computed from
 // the mouse position elsewhere in the sketch)
 float rightR = red(c2)*rightIntensity;
 float rightG = green(c2)*normalizedY*rightIntensity;
 float rightB = blue(c2)*normalizedY*rightIntensity;

 float leftR = red(c1)*normalizedY*leftIntensity;
 float leftG = green(c1)*normalizedY*leftIntensity;
 float leftB = blue(c1)*leftIntensity;

 // filling the output by adding the calculated color channels from both sides
 out.pixels[i] = color(leftR+rightR, leftG+rightG, leftB+rightB, 255);
}
out.updatePixels(); // push the modified pixel buffer back to the image
In the last line of the loop, one can see that after applying the color mask for each side, the resulting pixel color is just the channel-wise sum of the left and right pixels. The full source code contains comments that are useful for understanding the implementation details, and the original images can be downloaded directly from this post if you want to reproduce the results.

Other fun experiments with color channels

The assignment consisted of using our own cameras, experimenting with exposure parameters and different light setups, and then using software (for non-engineers) or a programming language to play with the color channels of the acquired images.

The first experiment I did was taking two pictures with different apertures and experimenting with blending their color channels to create simple glow effects. The source images are shown below:

Large aperture: 35mm F1.8

Small aperture: 35mm F8.0

Some of the results are shown below:

Blue glow

Purple back, Green detail

The core of the source code for the result achieved above is this:

for (int i = 0; i < out.pixels.length; i++) {

 // original pixels from the pictures
 color c1 = img1.pixels[i];
 color c2 = img2.pixels[i];

 // interpolation of the red, blue and green channels between the
 // two exposures, based on the mouse location
 float r = map(mouseX, 0, width, red(c1), red(c2));
 float b = map(mouseY, 0, height, blue(c1), blue(c2));
 float g = map(mouseY, 0, height, green(c1), green(c2));

 // channel swap for the effect: green drives the red and blue
 // channels, while the red/blue mix becomes the green channel
 out.pixels[i] = color(g, (r+b)/2, g, 255);
}
out.updatePixels(); // push the modified pixel buffer back to the image

Life on Mars

Finally, I also used a picture I found on the internet to play with its color channels. I liked the result of using the green channel of a forest picture as the red channel, and diminishing the others. The result is shown below:


(Plant) Life on Mars?
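The channel swap behind this effect can be sketched in a few lines of NumPy (a sketch of the idea described above; the `dim` attenuation factor is an illustrative value, not one taken from the original experiment):

```python
import numpy as np

def mars_effect(img, dim=0.3):
    """Use the green channel as the red channel and attenuate the
    others, turning lush green vegetation into reddish 'Mars' tones.
    Expects an H x W x 3 uint8 RGB image."""
    out = np.empty_like(img)
    out[..., 0] = img[..., 1]                            # red <- green
    out[..., 1] = (img[..., 1] * dim).astype(img.dtype)  # diminished green
    out[..., 2] = (img[..., 2] * dim).astype(img.dtype)  # diminished blue
    return out
```

Since vegetation is brightest in the green channel, promoting it to red makes the foliage glow red while the attenuated remaining channels keep the scene dark and alien.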

I’ll keep posting my assignments here at the blog, as long as they are not part of a research project.