Los Angeles California USA
©2008 CreativeCOW.net. All rights reserved.
Dave Stump, ASC has worked as a DP, effects cinematographer and VFX supervisor on dozens of films including Quantum of Solace, X-Men and X2, The Bourne Identity, Army of Darkness, Star Trek: First Contact, Batman Forever and many more. He chairs the American Society of Cinematographers subcommittees on Cameras and Metadata. In this expanded version of an interview first published in Creative Cow Magazine, Dave describes a possible future for filmmaking -- faster, less expensive, and more creative -- as cameras and metadata come together.
Pictures and sound are data. Information about them is metadata — the data about the data.
Metadata can begin with information as simple as reel name, clip name, date, duration. However, with new cameras skipping video and film as we’ve known them and recording straight to digital files, the potential complexity of the metadata skyrockets.
This is why metadata collection is moving closer and closer to the beginning of image capture, to lenses, cameras, even cranes.
Dave Stump is the chair of the Camera subcommittee of the American Society of Cinematographers, and co-chair of the Metadata subcommittee. His message to Hollywood is how critical it is that camera issues and metadata issues be addressed at the same time.
Creative Cow’s Gary Adcock assists Dave on these two committees, and told us about a presentation that he and Dave gave during NAB 2008 to illustrate metadata. Dave held up a photograph, and asked if anyone in the audience could figure out who it is. After some guessing, someone in the audience suggested looking at the back of the photo to see if a name was written there.
Dave said, “Ah, you mean check the metadata.”
(On the back of the photograph is written "Earl Stump, 1918." It's Dave's grandfather.)
As his “day job” Dave has served as the visual effects director of photography and VFX supervisor for dozens of films, as diverse as “X-men” and “X2,” “Batman Forever,” “Stand by Me,” “Free Willy,” and 2008’s James Bond film, “Quantum of Solace.”
Regardless of a film’s scale or genre, Dave’s task is the same: enabling the realistic combination of camera footage with CGI. Until very recently, much of that work was done by hand, guided by informed guesswork, hoping to match camera position, lens length, focus and more -- typically all of them in motion at once over the course of a shot.
In 2000, Dave was part of a team that received a Technical Achievement Award from the Academy of Motion Picture Arts and Sciences, for hand-development of advanced camera data capture systems, which he describes below. We’ve come a long way since then, in no small measure thanks to the concerted efforts of Dave and his colleagues.
As he tells it, his primary goal in that ongoing effort was simply to explain what metadata is, and why it matters.
Dave Stump: Privately, my secondary goal was to shame the proprietary sense of everyone in the manufacturing community who builds our tools. Because everyone who builds a machine, everyone who builds a computer-driven device, everybody who uses metadata, builds their own metadata scheme, and no two of them talk to each other.
You know the saying, “Standards are great. That’s why we have so many of them.” If no two standards can talk to each other, there’s no uniformity to the metadata. It becomes meaningless.
Gary Adcock: It’s not only that they can’t talk to each other. Even when they do create and handle metadata, they don’t store it in the same place. The ASC is trying to maintain the integrity of the workflow.
Dave: That’s right.
What we end up making now, out of pictures, is data. Whether we are shooting on film or a digital camera, the pictures end up as files. Nobody finishes the pictures on analog media anymore; nobody photochemically carries a movie all the way through, really. You don't take a negative and cut it and print it and time it and make an answer print. That's not the way you make movies anymore. That's not the way you're ever going to make movies again.
You put it on a scanner, you save it as data. Some of it you send out to a visual effects house, some of it, you run through an Avid or a Final Cut system. Whoever is working on it at one visual effects house puts it in a Shake system, or some of it goes in Maya. Some of it goes into an Autodesk Flame, some of it goes into Inferno, some of it goes into Matador.
All of these systems will bring in a DPX file or Cineon file. Sure, we know what this is. But the data in it that would have told you when and where it was created, and who it belongs to, and how it was named in accordance with standards set for the particular movie, and what the original colorscape of it was, or what the original camera settings were, or how quickly it was panning from left to right in degrees or frames -- all of the information put in there is discarded the moment it's fitted into another new machine. Thrown in the trash.
“What do I need that for?” [someone asks.] “I’m just here to do some compositing.”
So when you get back a file, all that information has been decimated. And there is no reason why we can't all agree on the value of the data like that. And agree to do no harm to it.
Tim Wilson, Creative Cow: What are some of the specific kinds of things you want to preserve?
Dave: Dozens of things. For starters, preserve the naming convention of a particular movie or particular studio or a particular post house or particular vendor.
Naming is vastly more important than we think it is, because that's how you find things. That's the first thing you look for in databases. "Go look for a file called 'The Buddy White Story.' We started off naming the third and fourth characters in the name string as 'bw.'"
If the last place that you sent the file stripped the name and used their own naming convention, which is some UNIX string of numbers, or a random number or date that they used, the "bw" is gone. Now you can't find it with the computer!
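The failure mode Dave describes can be sketched in a few lines of Python. (This is purely illustrative -- the "bw" convention and file names are hypothetical examples from the interview, not any real studio's scheme.)

```python
# Hypothetical example: a show-level naming convention embeds a project
# code ("bw" for "The Buddy White Story") as the third and fourth
# characters of every clip name, so files can be found by position.
def belongs_to_show(clip_name: str, show_code: str = "bw") -> bool:
    """Check the third and fourth characters for the show code."""
    return len(clip_name) >= 4 and clip_name[2:4] == show_code

clips = [
    "a1bw_sc012_tk03.dpx",  # follows the convention
    "4f8e9c21a7.dpx",       # renamed by a downstream vendor
]
found = [c for c in clips if belongs_to_show(c)]
# Only the first clip is found; the vendor-renamed file is invisible
# to any search keyed on the naming convention.
```

Once the name is stripped, no amount of searching recovers the association unless the convention was also carried as metadata inside the file.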
So naming conventions are the first thing that's really, really important. But the kind of information we expect to attach to our picture files goes vastly, vastly deeper than that. Have you, for example, seen the menu structure of a Sony F900 or a Panavision Genesis? The menu tree of the Genesis is -- I don't know, Gary, what would you say? Probably 100 different criteria.
Gary: Minimum. I think it's closer to 200. With the Sony F23, it’s something like 262 items.
Dave: So yeah, 262 menu entries. The F23 has almost the same back-end menu structure as the Genesis, so call it 260 fields of metadata that ought to be included in every picture the camera makes. Just for starters. Just for that camera alone. And that doesn't even include the main menu criteria that also ought to be there, or the criteria that Sony hasn't even thought to put in yet.
The problem is that so few of the people who are part of this process have sat down and agreed on how the data ought to come out. Most of them want to build the machines where the data comes out themselves, and fit them into another proprietary box which you have to buy from them. So the monetary interest in being the only solution for metadata prevents the universalization of standards.
And, excuse me, that’s what standards mean! Something that’s open source and universal. When you say “our standard,” it’s no longer a standard.
Dave: What we are discovering now is the truth in what I proposed years ago: that you can make cameras smart enough to know what lens you are putting on. You plug a lens into the camera, and little contacts in the back say, here it is: Panavision lens, 15mm, serial number 119, and here is its mustache curve, here is the distortion map for this lens, it is focused at 7 ft., stopped at f/8, and so on.
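The kind of self-describing record such a lens mount might hand the camera body could be sketched like this. (The field names are illustrative, invented for this example; they are not any manufacturer's actual schema.)

```python
from dataclasses import dataclass

# Hypothetical sketch of the record a "smart" lens mount could report
# to the camera at mount time. All field names are illustrative.
@dataclass
class LensMetadata:
    manufacturer: str
    focal_length_mm: float
    serial_number: str
    focus_distance_ft: float
    f_stop: float
    distortion_map_id: str  # reference to this lens's measured distortion map

# The example values Dave gives in the interview:
lens = LensMetadata("Panavision", 15.0, "119", 7.0, 8.0, "pv15_sn119_v1")
print(f"{lens.manufacturer} {lens.focal_length_mm}mm, "
      f"focus {lens.focus_distance_ft} ft at f/{lens.f_stop}")
```

The point is that every one of these values already exists physically at the mount; encoding them is a data-format problem, not a sensing problem.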
Tim: I was struck by your earlier example of following the degrees of the angles that the camera moves through during a shot.
Dave: Yes, on a frame-by-frame basis. Because the visual effects people then have to take whatever pictures you’ve created at 24 or 29 or 120 frames per second, put them into a tracker, Boujou or PFTrack or who knows what, and solve for movements including dolly and tilt, focus, zoom, boom, swing, track and everything else that goes into a shot. It’s a horribly complicated equation to figure out after shooting.
Yet in the grand scheme of things, that’s a minuscule amount of data to collect while shooting. You only have to remember to ask, “O’Connor, the next time you build a pan head, we want it with a plug for a data recorder.” Or “Panavision, do you have a GPS set that you can build into the base plate?”
GPS apparently takes very little real estate because it’s there in my iPhone sitting on my desk.
Gary: I look at it from the post side. Cooke Optics has this little box, the “/i dataLink.”
Cooke Optics /i dataLink with the Arri 235 film camera
It records focus, zoom and all that from the lens, and then everything from the camera too. It records all that to this little SD card.
Now you have the actual data. Instead of having to recreate it, you can do motion matching and everything in VFX long before the footage itself actually gets there. There’s not somebody waiting for the footage, and then starting to do all this work manually for weeks and weeks on end.
Or conversely, you could take VFX information from files that have already been created, and run queries against master shots or something that's already been approved. Everything gets more and more efficient down the line. You can streamline the costs and expenses and redos and everything else further down the food chain, thereby saving money in the long run.
Dave: Exactly. This is the classic mistake that studio bean counters make. “We need to get the budget down, so let’s beat this guy up for more of his wages.”
Instead, for a shot that used to be a Boujou problem, you create a sync frame, like the bloop on the slate. Now comes the rest of the data: here’s the center shutter open pulse, here’s the pan, tilt, focus, zoom, f-stop, dolly, boom — synchronized with every frame of the film that you shot.
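The per-frame stream Dave describes reduces to one record per frame, time-aligned to the camera's center-shutter pulse. A minimal sketch (field names and values are illustrative, not any real system's format):

```python
from dataclasses import dataclass

# Hypothetical per-frame metadata record, synchronized to the camera's
# center-shutter pulse. One of these per frame replaces weeks of
# hand-tracking. All field names are illustrative.
@dataclass
class FrameRecord:
    frame: int       # frame count from the sync ("bloop") frame
    pan_deg: float
    tilt_deg: float
    focus_ft: float
    zoom_mm: float
    f_stop: float
    dolly_ft: float
    boom_ft: float

# A short made-up take: a slow pan with a gentle dolly push.
take = [
    FrameRecord(i, pan_deg=0.5 * i, tilt_deg=0.0, focus_ft=7.0,
                zoom_mm=24.0, f_stop=8.0, dolly_ft=0.25 * i, boom_ft=0.0)
    for i in range(3)
]
```

A tracker or compositor can consume a stream like this directly, instead of solving for the camera move frame by frame after the fact.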
The artist who would have spent six weeks tracking this out by hand, and reverse engineering camera position and focal length anecdotally or from someone’s handwritten notes, can now simply take the metadata file, plug it in and start doing the work. The real work.
This is the way that I love to frame the discussion, as an invitation to the producers and the studios who want to save money. You know, we can all stand around and haggle over 50 cents an hour for every employee on the staff and you can feel like you’ve saved some money.
Or we can automate those people’s work, get it done in a week’s less time or a month’s less time, and then save some real money.
Everyone asks, well, who’s going to pay for developing all of this new automated metadata collection? I say, we already pay for it anyway. How often do you buy computers and cameras and lenses? We renew and replenish this stuff on a daily basis. At least ask manufacturers for what you want in the updates, rather than just taking what you’re handed.
SIDEBAR: QUANTUM OF SOLACE
In one sequence, James Bond and Camille (Daniel Craig and Olga Kurylenko) are tossed from an airplane with only one parachute between them. A typical approach to this in the past would have been to use stunt people for the most difficult sequences, and intercut with greenscreen footage. The problem is that it looks a lot like a combination of stunt people, with two flat layers composited together for the inserts -- each of which could only last for a second or two before the "seams" would show.
The filmmakers wanted something much more realistic: "real" bodies falling in "real space," allowing the camera to move around them, and holding the shots for as long as it took to tell the story.
To create the illusion, the actors and their doubles were trained to free fall inside an ex-military vertical wind tunnel, six stories tall with a wind machine blowing at 150 MPH. “We took out all the windows and some of the walls and painted it white to suit our purposes,” says Dave, “and we strung lights everywhere — in the bottom, all around the walls. We put in 8 Dalsa Origin 4K cameras and 7 Sony F900Rs, all of them locked in place. We also had an Arriflex 435, which was mounted on a Steadicam and flown in freefall alongside the actors.
"Not only that, this wind tunnel amounts to a metal thermal bottle. So when you start stringing high frequency, high energy digital image cables around the same structure, you quickly discover that with all of that wind rushing through it at 150 miles an hour, it becomes a cross between a transformer and a Van de Graaff generator. There were tense moments.
"But aside from one bad BNC cable that started causing me trouble, the next biggest problem I had was that about halfway through the day, a fan in the top of the machine blew a seal and started dripping oil. So we had to stop for an hour while they changed that.
"For everything else, we got everything they wanted to get shot, and it was truly an astounding sight to behold. The physics of watching Daniel and Olga in this wind tunnel was absolutely spot on for two human bodies falling from an airplane.
“The heart of the challenge was to synchronize all of those cameras, so that running with 90 degree shutters, they all have the same effective center shutter opening interval. And it had to be very, very precise.
"What we were attempting to do with the cameras was to create a data cloud. That is, we were creating a point cloud of metadata. We knew the focal length and the characteristics of every lens. We set them more or less locked off for the spot inside the tunnel where we wanted the actors to float. I built the stop deep enough so that we wouldn't have to rack focus. And we solved for every pixel from every camera for its position in space throughout the entire synchronized shot.
"Double Negative [the VFX house] took that data and solved for the position of the actors, who were then regenerated as CGI characters and inserted into real aerial photographic backgrounds from the film’s locations.
“It’s pretty astounding.”
Daniel Craig and Olga Kurylenko star in Metro-Goldwyn-Mayer Pictures/Columbia Pictures/EON Productions’ action adventure QUANTUM OF SOLACE. Photo credit: Karen Ballard. © 2008 Danjaq, LLC, United Artists Corporation, Columbia Pictures Industries, Inc. All Rights Reserved.
Tim: I'd like to get into some specifics. I know you started working with the Viper in its early days, right?
Dave: Yes. And when it was introduced, there was only one recorder for it, called the Director's Friend, which sort of disappeared almost immediately, and then there was nothing to record to.
Tim: So what happened?
Dave: So I took the camera and made a picture with it. And I made the picture the best way I could. I dumbed it down to 4:2:2 and recorded to D5 machines, where there's much rich metadata accompanying the images, even if it's in separate files on disks. I made a virtual movie. And I used that movie to proclaim loudly to the entire community: this is a 4:4:4 camera, and because you didn't build the machine, I had to do it at 4:2:2. Who's going to build me a machine to record this?
How many recorders are there now that can not only do 4:4:4 but, for that matter, 16-bit TIFF? Just by virtue of demands having been made of the community. I won't take the credit for the fact that maybe a dozen machines like this exist. But I will take credit for having spoken into that vacuum very early on.
And I can tell you the looks you get when you do that.
Dave: One of the obstructions to automating the motion picture workplace is that we don’t have a tradition of metadata on set. We have a tradition of what I call “metapaper.”
For example, script supervisors for the most part take a paper copy of the script, and note vast quantities of metadata in real time just by watching the movie being filmed: script changes, which actors are in each shot, and so on. And they notate that using lines and squiggles and arrows and notes all over the typed script, with handwritten notes to elaborate. They accumulate vast quantities of paper that people have to keep in notebooks.
The first assistant and the second assistant, all the cameramen, the loader — these people keep vast amounts of paper notes too. If you want to know what lens they were shooting with, or if you want to know what filters were on the camera or what settings they shot with, you have to dig out that notebook and find the page that you want, and hopefully it’s in the right place.
Now you want to find out the tilt angle for this particular CGI shot, approximately in degrees -- the best that the visual effects people were able to determine by standing there and looking at the crane, which is 20 feet in the air, and trying to guess what the tilt angle was for the given shot. Visual effects people have data wranglers to keep vast amounts of their own paper notes.
And you have to find that notebook and dig it out -- sometimes the notebooks for a production aren’t even all in the same place!
All of this metapaper exists separately and independently of the images themselves!
Once you don’t have to have those notebooks stacked in shelves, it becomes a downhill rush to automate all metadata coming to the editors. It’s a small step from there to attach the metadata to the picture files themselves, and to preserve that information as it passes from machine to machine in post.
FROM BATMAN TO ARTICULATE DEMANDS
Dave: The Tim Burton Batman movies were where I first used my hand-created data capture system.
In the earliest days of live action motion control and data capture, we had a shot that started in a macro closeup, then boomed up to 60 feet in the sky. The question everyone asked was, how in the world are we ever going to focus this thing?
We ended up attaching an encoder to the crane arm, so that for any position of the arm swinging up we had a numerical value for that position. We then wrote a lookup table, or translation table, as an “if/then” equation. If the arm has boomed up 6 feet, then the focus should be set at 6 feet. If the arm has boomed up 12 feet, then the focus should be set at 12 feet.
I’m oversimplifying, but it’s easy to put a motor on a focuser. Once you write that lookup table, and you swing the arm, the arm data drives the focuser, and there’s no mistake to be made. You have the numbers. It’s just an equation.
And while there was a motor involved in that task, ultimately we realized that the data was all you ever needed, and you could figure out the rest.
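The "if/then" lookup table Dave describes can be sketched in a few lines, assuming a table of calibrated encoder positions with linear interpolation between them. (The encoder counts and distances here are invented for illustration.)

```python
import bisect

# Hypothetical calibration: raw encoder counts from the crane arm mapped
# to the focus distance measured at that arm position. Values are made up.
CALIBRATION = [(0, 3.0), (1000, 6.0), (2000, 12.0), (3000, 24.0)]  # (counts, feet)

def focus_for(counts: int) -> float:
    """Return focus distance in feet for a raw arm-encoder reading,
    interpolating linearly between calibrated points."""
    positions = [p for p, _ in CALIBRATION]
    i = bisect.bisect_right(positions, counts)
    if i == 0:
        return CALIBRATION[0][1]       # below the table: clamp low
    if i == len(CALIBRATION):
        return CALIBRATION[-1][1]      # above the table: clamp high
    (p0, f0), (p1, f1) = CALIBRATION[i - 1], CALIBRATION[i]
    return f0 + (f1 - f0) * (counts - p0) / (p1 - p0)

# As the arm swings, each encoder reading drives the focus motor directly.
print(focus_for(1500))  # halfway between the 6 ft and 12 ft marks -> 9.0
```

As Dave says, once the table is written there is no mistake to be made: the arm data drives the focuser, and it's just an equation.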
But if someone is actually focusing the camera, and someone's actually pushing a dolly, and someone's actually hand-filtering the camera, that doesn't mean you can't record it.
It turns out that you can record anything that you can measure. So for “Batman Forever” I built a little kit, and Panavision, to their credit, built me three encoded PanaHeads that had differential encoders on primary axles, recording pan and tilt, and converting that to degrees, and saving that data.
Then I put a little puck wheel on a dolly. As it rolls, it can measure tracking distance, usually to better than a 16th of an inch.
For swinging the arm of a crane, the same thing: you put an azimuth encoder on the chain of a Titan crane, or you put inclinometer encoders on the side of the arm. When you read how many degrees of tilt the arm is going through, you know exactly what height the crane is at.
But we discovered there was inherent noise in those pendulum encoders. For example, if you start booming up and pushing the dolly at the same time, it generates an inertial noise -- a lurch as the movement begins.
I also discovered that if you put a pendulum encoder on the right side of the arm where it moves, and another pendulum encoder on the left side of the chassis where it doesn't tilt up, and you subtract the inertial noise, you have a noise graph that tells you how quickly the inertial jolt moved in addition to the tilt up of the arm. Subtract the noise from the arm movement, and you have pure arm movement. And that becomes extraordinarily valuable.
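The two-encoder trick reduces to a simple subtraction: one pendulum encoder rides the moving arm and sees tilt plus inertial lurch, while a second sits on the non-tilting chassis and sees the lurch alone. A minimal sketch with made-up sample values:

```python
# Hypothetical encoder readings over five frames, in degrees.
# The arm encoder sees real tilt plus inertial noise; the chassis
# encoder, mounted where nothing tilts, sees the noise alone.
arm_encoder     = [0.0, 2.5, 5.0, 7.5, 10.0]   # tilt + lurch
chassis_encoder = [0.0, 0.5, 0.25, 0.5, 0.0]   # lurch only

# Subtract the noise channel from the signal channel frame by frame
# to recover the pure arm movement.
pure_arm = [a - n for a, n in zip(arm_encoder, chassis_encoder)]
print(pure_arm)  # [0.0, 2.0, 4.75, 7.0, 10.0]
```

This is a common-mode rejection idea: any disturbance both encoders feel equally cancels out, leaving only the motion unique to the arm.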
So we were able to record all these axes of movement, unobtrusively. There was a little extra wiring on the dolly that we ran through a nice little cable harness, on down to an RS 422 line connected to a computer sitting off to the side.
Tim: How did we get from your hand-crafted systems to something more universal?
Dave: I had a meeting with some of the fellows from the Fraunhofer Institute in Europe. They saw that I was building and developing my own hardware and encoding systems, and then strapping it all onto dollies and cranes and arms. They said, “We get that you like to tinker and build this stuff yourself, but your greater value to the community is in making articulate demands of the rest of the hardware community, so that they get other people to build it for you.”
That was very liberating to me, very freeing, just realizing that the community can make demands of manufacturers.
And now, Panavision have a data port out of every Technocrane they own. You can walk up and plug a data capture system into the base of a Technocrane and record every move for every frame. I wrote the connector standard for them, so I know. [Laughs]
Motion capture interface on the Panavision Technocrane
More than that, it’s just a big conduit, a data pass-through for anything attached to the Technocrane, including any camera and lens data that can be collected off those machines.
Tim: Can you also collect metadata from non-Panavision cameras and their lenses on the crane?
Dave: Yes! If the camera and lens and head send out data, it will pass through the crane. So you can put an Arriflex camera on that crane and record all the data.
Tim: Now you’re talking!
Dave: You know, Panavision actually got involved very early on with putting encoders in their lenses. The guys at Fujinon also developed a system to output data from their lenses for George Lucas to use on the first digital Star Wars movie. Arri have taken a somewhat proprietary approach to packaging their data. But they’re starting to see the logic of open source.
So there have been baby steps, but the Cooke /i lenses are the first committed, open source invitation to everyone to embrace gathering metadata from lenses. If you look on the Cooke Optics website, you can download a PDF file. “Here is the standard, here are the connectors, here is how it’s wired, here’s how the data comes out. Do with it what you will. It’s open source.”
They completely have the right idea. It’s up to us in the community to demand that the rest of the imaging chain deliver data recorded with the images themselves as they’re gathered on set, in ways that everybody else can use.
Tim: It sounds like you’ve done an awful lot in terms of moving film production into the future with on-set metadata. So, what do you want to do now?
Dave: The problem is that in order to capture metadata, you have to agree how to name it, what it means and where to put it.
There are over 2000 fields of metadata defined in the industry dictionary. What we did in committee was apply for and receive an ASC node to the metadata dictionary, so that we could define the fields of metadata that we felt were important on set, for inclusion into the metadata header.
We wanted to be able to assure that our work and visual effects work could be automated. And the only way to ensure that is to include it in the dictionary.
Dave: Once we have metadata everywhere, everyone will look around in shock and awe and ask each other, “How did we ever make movies without this stuff?”
On-set metadata collection will become as ubiquitous as the walkie-talkie. You know, how did we make movies before we had walkie-talkies? Well, we shouted and stood on the side of the mountain and sent semaphore to the guy on the next hill. We sent smoke signals! Fire a gun — that means “GO!”
And now, you look at all the walkie-talkies on a set and don’t even think to ask anymore how we ever made movies without them!
Well, when on-set metadata becomes useful and ubiquitous, we’ll be saying the same thing about it then. Instead of waiting for all the pieces of paper from the script supervisor and everyone else on the set to arrive in an envelope at the production office each night, we can have digital metadata, collected automatically on-set, delivered even as we’re shooting.
You know, if you can turn the focus barrel of a lens into data, you ought to be able to turn the meaning of a script supervisor's wavy line on a tablet computer into the proper kind of data as well. In fact, that should be a trivial task compared to encoding a lens.
The amount of information in today’s physical metadata — script notes, camera movements, camera settings — is trivial, insignificant in size compared to the actual picture or sound data we’re already collecting. But getting it attached to the picture and sound data is NOT trivial. And it won’t happen unless you ask for it.
The question is being asked. The answers are being provided. It just takes time for the herd to move in that direction. So, every chance that I get, I speak to the herd, and I speak to the possibility of what we could be doing.
The tools of metadata can and will enable authorship of images, control of look management, efficiency in visual effects and editorial, and make better movies while saving the producers and studios money!