I Made a Short Film Set in Mumbai That Ends on a Rocket. Here Is Every Decision I Made.

A director's commentary on Dabbawale — the 52-second AI short film about a Mumbai dabba delivery man whose route extends to zero gravity. Every creative decision, in sequence, with the reasoning behind it.

Written by: RizzGen Team
Published on: June 26, 2026
Reading time: 8 min read
Category: Production Breakdown
Image: Surreal concept art of a dabbawala delivery to a rocket in zero gravity. Abstract 3D render by RizzGen.

The idea was simple.

A Mumbai dabbawala — one of the delivery men in the city's famous tiffin carrier network — finishes his route. And instead of turning back, the route keeps going. Past the last address. Through a gate. Into an ISRO facility. Onto a rocket. And then, impossibly, into zero gravity, still holding the tiffin box, still delivering.

No dialogue. No explanation. Just a man doing his job, and the job taking him somewhere no job has ever taken anyone.

Fifty-two seconds. One idea. A specific visual logic I knew before I started generating a single frame.

The entire production session is public. You can read every prompt, every creative direction, and every concept decision I made in the Dabbawale session. This post is something different. It is not the session. It is the thinking beneath it: the directorial reasoning I had before and during production that shaped what you see in the final film.


The Premise Decision: Specificity Over Concept

The first creative decision was not visual. It was specificity.

"Mumbai delivery man" could have been anyone. A courier. A food delivery app rider. A postal worker. These are all real and interesting, but they do not have the cultural weight of a dabbawala.

The dabbawala network is one of those things that sounds invented until you look it up. Thousands of delivery workers, operating across Mumbai's railway network with almost no technology, achieving a delivery accuracy that has been studied by business schools as a model of logistics. They carry home-cooked meals from middle-class households to office workers across the city. The meals travel on trains. The network runs on a coding scheme marked on the boxes, not on digital tracking.

I chose dabbawala specifically because the scale contrast was already built into the premise. These are ordinary men doing an extraordinarily precise thing, with ordinary equipment, in an ordinary city — and they are famous for never losing a delivery.

The extension of that logic to orbit is not absurd. It is consistent. Of course the dabbawala delivers to the rocket. He has never missed a delivery. Why would this one be any different?

This specificity — cultural specificity, not invented specificity — is what keeps the film from feeling like a random AI visual. The dabbawala is a real thing. The premise has a logic. The comedy and the pathos both come from understanding what the dabbawala actually represents.

The directorial lesson here: before you think about visuals, think about the specific real-world thing your idea is rooted in. Generic AI video is generic at the premise level before it is generic at the visual level. Specificity of idea propagates into specificity of image.


The Colour Decision: A Temperature Arc

The most deliberate visual decision in the film is the colour temperature shift.

Mumbai → warm. ISRO/space → cool.

This was specified before a single scene was generated. It is in the concept plan, not discovered in post. Here is the reasoning:

Mumbai street photography has a characteristic warmth. Orange light off stone buildings. Yellow tungsten from street stalls. The blue hour light that settles on the city between dusk and full dark. Rain-wet streets that reflect all of it — reds, oranges, the occasional splash of neon. It is an inherently warm visual environment, and that warmth carries emotional associations: domestic, inhabited, human-scale.

ISRO and the space environment are the opposite. Government facility lighting is fluorescent, flat, institutional. Space itself is black and absolute, with the harshest directional light imaginable — the sun unfiltered by atmosphere. Steel and concrete and the cold geometry of engineering.

The shift from warm to cool across the film's runtime maps directly to the emotional journey: from familiar and human to alien and extraordinary. The colour does not describe the journey; it enacts it. The viewer feels the world getting stranger before they can articulate why.

For the Mumbai sequences I specified cool highlights and warm shadows. This is a cinematography technique associated with a certain look in Indian commercial photography, where the shadows are filled with amber and the highlights are pushed toward teal. It creates a richness that is distinct from western cinema's version of warm. This was in the visual direction for scenes one through three.

The space sequence inverts this: cool shadows, the absolute darkness of space, with the only warm elements being the tiffin box itself and the reflected light from Earth in the background of the final shot. The human object stays warm in the cold environment. That contrast is the film's last image.


The Camera Decisions: Why Each Shot Moves the Way It Moves

Opening street scenes: moving, observational, slightly handheld.

Dabbawala documentation in the real world is almost always shot this way. The cameras that have followed these men for news segments and documentary features have always been in the environment with them — on the trains, in the crowd, moving through the city. A static, composed shot would immediately signal fiction in a way that contradicts the film's emotional approach.

I specified "observational handheld, mid-distance, following action" for the street sequences because I wanted the camera to feel like it belongs there. Like it is one more person in the crowd.

The ISRO gate: static, symmetrical, institutional.

The moment the dabbawala passes through the ISRO gate, the visual language changes. Static. Symmetrical. The gate is centred. This is deliberate — it is the first signal to the viewer that something has shifted. The camera is no longer following. It is placed. It is watching.

This kind of composition — static, symmetrical, a figure walking through an institutional frame — has a particular register. It is somewhere between a surveillance camera and a Tarkovsky shot. I did not specify Tarkovsky because that reference can produce unpredictable results; I specified "static symmetrical institutional, figure walking through centre frame, cool fluorescent overhead lighting."

The launch sequence: macro, texture, then scale.

I wanted to avoid the obvious choice for a rocket launch — the wide shot, the dramatic pull-back, the full size of the thing against the sky. That is the awe shot. It is what every launch sequence does.

Instead: close on the dabbawala's face, then close on the tiffin box in his hands, then a shot of his feet on the gantry. The rocket's scale is implied, not shown. You feel the wind and the heat before you see the launch. The dabbawala's expression is not awe — it is attention. He is watching the procedures with the same focused attention he brings to sorting tiffin boxes at the station.

This restraint makes the subsequent cut to zero gravity more disorienting. You never quite saw the full size of the thing he got on.

The zero gravity sequence: slow, weightless, long lenses.

Slow camera movement in space photography creates a meditative quality — the same effect as slow motion in another context, but achieved through the pace of the camera's own movement rather than the speed of the subject. I specified long-lens shots with very slow movement for the space sequence. The sense of silence is partially visual in AI video — it is encoded in camera pace.

The tiffin box floating in zero gravity is the film's defining image. I specified that it should be in the foreground, with Earth in the background and the dabbawala's hand reaching to retrieve it after it has drifted. Not tumbling, not spinning dramatically. Just floating, with the gentle, purposeless drift of objects in weightlessness. The delivery man retrieves it. Steadies it. Holds it.

That is the last image before the cut to black.
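The colour and camera directions above can be written down as data, which makes the arc easy to check before anything is generated. This is a minimal sketch, not the format the session or RizzGen actually uses: the class, the field names, the sequence names and the helper functions are all hypothetical, and the "cool" palette for the launch is an assumption based on the Mumbai warm, ISRO/space cool rule. The camera strings are taken or paraphrased from the directions described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceDirection:
    """One sequence of the film (hypothetical structure, for illustration)."""
    name: str
    palette: str  # "warm" or "cool"
    camera: str   # free-text camera and lighting direction


# Directions summarised from this post. The launch palette is an assumption.
SHOT_PLAN = [
    SequenceDirection(
        "mumbai_street", "warm",
        "observational handheld, mid-distance, following action",
    ),
    SequenceDirection(
        "isro_gate", "cool",
        "static symmetrical institutional, figure walking through centre "
        "frame, cool fluorescent overhead lighting",
    ),
    SequenceDirection(
        "launch", "cool",
        "close on face, close on tiffin box in hands, feet on gantry, "
        "rocket scale implied rather than shown",
    ),
    SequenceDirection(
        "zero_gravity", "cool",
        "long lens, very slow movement, tiffin box in foreground, "
        "Earth in background",
    ),
]


def palette_arc_is_one_way(plan):
    """Return True if the plan never goes back to warm after turning cool."""
    seen_cool = False
    for seq in plan:
        if seq.palette == "cool":
            seen_cool = True
        elif seen_cool:
            return False
    return True


def build_direction(seq):
    """Join one sequence's directions into a single text direction."""
    return (
        f"{seq.camera}; {seq.palette} colour palette; "
        "same scratched metal tiffin box as every other sequence"
    )


if __name__ == "__main__":
    print(palette_arc_is_one_way(SHOT_PLAN))
    for seq in SHOT_PLAN:
        print(f"{seq.name}: {build_direction(seq)}")
```

The one-way check only confirms that the plan never returns to warm once it has turned cool. It cannot tell whether a generated scene actually matches its palette, so judging a grade that came back too cool too early remains a call made by eye. Appending the tiffin box line to every direction reflects the choice to keep that one object constant across all sequences.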


The Rain Decision

It rains in the Mumbai sequences.

This was not a late addition. It was in the visual direction from the beginning, and the reasoning is specific.

Rain in Mumbai street photography does several things. It creates reflection surfaces — puddles become secondary light sources, multiplying the colour sources in a frame. It creates a quality of atmosphere that diffuses light slightly, softening shadows and creating the kind of image that has texture rather than harshness. It creates a human visual context — umbrellas, people moving quickly, a specific body language that is familiar to anyone who has been in a monsoon city.

More importantly for this film: rain is ordinary in Mumbai. In monsoon season the city is one of the wettest big cities in the world. The dabbawala delivers in rain without adjusting. He does not slow down, does not stop, does not appear to register it. The rain is part of the visual environment, not a dramatic event.

This ordinariness of the rain is what makes the extraordinary destination feel like a natural extension rather than a rupture. The film's logic is: nothing disrupts this man's delivery. Not rain. Not traffic. Not space.


The Tiffin Box: The One Warm Constant

Everything in the film changes. The city gives way to the facility. The facility gives way to space. The warmth of the street gives way to the cold of the infinite.

The tiffin box does not change.

It is the same object in every scene. The same scratched metal, the same distinctive marking on the lid that identifies the delivery address. I specified that the tiffin box should be visually consistent across all sequences — same size, same patina, the same quality of domestic ordinariness that makes it absurd and somehow moving in a zero-gravity environment.

The box is the film's argument. It is proof that wherever the dabbawala is, whatever environment he is in, the task is the same. The delivery must happen. The box must arrive.


What the Session Looks Like

The full production conversation for Dabbawale is public and requires no account to view. What you will see in that session is the translation of the above thinking into specific production directions: the concept plan where the colour arc and emotional logic are laid out before generation begins, the scene-by-scene visual direction where camera and lighting decisions are specified, and the conversation where I redirected specific scenes that came back slightly wrong.

The session is not a neat, linear process. There are moments of redirection — a scene that came back too static, a colour grade that was too cool too early. The session is a document of a production process, not a demo of perfect first-take generation.

That is what I wanted to make visible. Not "AI makes films flawlessly." But "a human directing AI, making decisions, catching problems, correcting course." That is what filmmaking is, regardless of what the execution layer is.


The Question I Am Most Interested In

When I shared Dabbawale, the response I was most interested in was simple: does it read as a film, or as an AI experiment?

That is the real question for anyone making anything in this medium right now. Technical capability is not the bottleneck. Every major AI video tool can produce visually impressive frames. What the field has not figured out is whether it is possible to make something that feels directed — that has a logic, a point of view, a reason every shot is the shot that is there.

Dabbawale is my attempt at that. The specific colour arc. The camera logic that changes as the world changes. The rain. The tiffin box. The restraint in the launch sequence.

These are all choices. Made before generation. Made with reasons. The AI executed them. The choices are mine.

That is the distinction I am trying to prove is meaningful.

About RizzGen

We're building scene-based AI video tools for creators who need consistency and control. Founded by indie hackers who were tired of prompt gambling. Based in India, building for the world.
