How often do we have nice regular subjects like this? Hardly ever.
The pixel race continues: more resolving power is being crammed into smaller and smaller physical sizes. The recent Hasselblad X1D announcement is at the pointy end of that: we now have medium format resolving power and tonal quality in a package that’s smaller than many 35mm solutions. I have a theory about resolution, the megapixel race and perception. Aside from the marketing logic that 100>50>24 and more must therefore be better, there are much more fundamental reasons why we feel the resolving power limitations of digital far more acutely than film. And it isn’t just our ability to pixel-peep with ease; it’s more to do with the fundamental nature of the world. Yes, there’s sufficiency in output because of the limitations of the output device itself, but I suspect output devices will catch up with and exceed capture very easily. Allow me to explain why, and why I think there’s a way out that might well result in a very different sort of sensor…
The nature of film is fractal, irregular: the photosensitive particles are not of a uniform size or shape, nor are they regularly arranged. The nature of digital is regular: each photosite is the same size and shape and distance from the next one; the entire sensing array is a grid. The world is definitely not made up of elements of a discrete size and shape – at least not until we get into the quantum realm, far beyond our ability to see 🙂 Remember, our eyes too are an irregular capture device – both because of the layout of the cells, and because our eyeballs are continuously moving and scanning (and thus averaging) a scene.
I think the disconnect is obvious: we are trying to represent something irregular with something regular. The only way you can make a curve out of line segments is by having discrete breaks; the more breaks, the closer your approximation gets to the curve, but you will never actually have a curve regardless of how many breaks and segments you use. This is digital photography in a nutshell: we are trying to represent a curve with line segments, and the only way we can improve transparency is by increasing the number of segments (resolution). The illusion of transparency is only achieved when the increments are not perceivable – whether because of the limitations of our eyes’ resolving power and viewing distance, or the output medium itself. But given sufficient enlargement, you’re always going to see the steps. In this way, it’s very obvious (beyond other digital artefacts such as oversharpening haloes, posterisation or strange colours) when a file has been output beyond its maximum size.
The same is true of tonal levels, too – not just spatial resolution. If you only have two possible outputs – white and black – you can only approximate grey by averaging, and only if your white and black points are at a sufficiently high frequency to trick the eyes into seeing a continuous average tone. (This is how halftone printing works.) It’s the reason we want as many color steps as possible – the more bit depth per channel, the more discrete steps, and the closer we can come to the illusion of continuity.
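As a small aside for the technically inclined, here’s a minimal sketch of that averaging idea – a grey ramp rendered using nothing but black and white via error diffusion. It’s purely illustrative (not how any particular printer actually halftones), but viewed from far enough away, or averaged locally, the two-tone pattern reads as continuous tone:

```python
# Render a grey ramp using only 0s and 1s (Floyd-Steinberg error diffusion).
# Illustrative only -- not a model of any real printer's halftoning.
import numpy as np

def dither_1bit(gray):
    """gray: 2-D float array in [0, 1]; returns an array of 0s and 1s."""
    img = gray.astype(float).copy()
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 1.0 if old >= 0.5 else 0.0    # only black or white available
            out[y, x] = new
            err = old - new                     # push the rounding error onto neighbours
            if x + 1 < w:               img[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               img[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: img[y + 1, x + 1] += err * 1 / 16
    return out

ramp = np.tile(np.linspace(0, 1, 512), (64, 1))  # smooth grey ramp
halftone = dither_1bit(ramp)                     # same ramp, but only 0s and 1s
print(halftone.mean(axis=0)[::64])               # column averages roughly track the original greys
```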
I’m sure you’ve noticed enlargements of older film images don’t seem to suffer from the same problem, no matter how big the enlargement. Sure, the definition may appear to be less acute, but you never get the sense that the subject is a facsimile made of Lego. Why? Because the entire process does not take place in discrete increments: the capture medium is nonlinear; the optical enlargement process is nonlinear, and the output media has more resolving power than the input signal, and it too is nonlinear/ non-discrete. The same thing is true of the tonal scale: it’s nonlinear, and continuous. The result is a continuous image in all dimensions – spatial and tonal. From an artistic standpoint, it means we are not distracted by the artefacts or the discontinuities, leaving the focus on the subject and composition. This is of course the ideal presentation of an image.
From a photographic standpoint, I don’t think more resolution is the solution simply because the race never ends: output and input keep increasing, with increasing processing overhead and the feeling that there’s always a bit more one can eke out. There’s a very tangible difference between a 360PPI print and a 720PPI Ultraprint, but it has to be seen in person and can’t easily be quantified. The reality is that it isn’t always the raw spatial resolution that makes the difference: it’s the illusion of continuity of tone and detail that the increased resolution provides.
In any case, printing itself is a good example of a largely non-discrete output in practice; despite the best intentions of the printer-makers to have the head lay down uniform coloured dots of ink over each other, thereby creating a discrete reproduction medium, the reality is at that scale – picolitres – and on that irregular medium – fiber paper – the ink dots diffuse out and spread. And they don’t always hit the same place; in fact, they’re probably close enough that they merge with the dots from the next input pixel. And so forth – the smoother and more irregular the merging on that scale, the more continuously toned the print will appear. It’s also the reason why a good printmaker will often add noise to an image – the subtle variation in color and luminance can help to perceptually ‘fill in’ areas of color or tone that look too uniform to be plausible.
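A tiny sketch of that last trick, for the curious – the amplitude and Gaussian distribution here are assumptions for illustration, not anyone’s actual printing recipe:

```python
# Add gentle luminance noise before output so large smooth areas don't read
# as implausibly uniform (or band). Values chosen purely for illustration.
import numpy as np

def add_print_noise(img_uint8, amplitude=1.5, seed=0):
    """img_uint8: H x W x 3 uint8 array; amplitude is in 8-bit levels (an assumption)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, amplitude, img_uint8.shape[:2])[..., None]  # same offset for R, G, B
    return np.clip(img_uint8.astype(float) + noise, 0, 255).astype(np.uint8)
```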
Here we come to the crux of the matter: I think the limitation of digital lies in the geometry and shape of the photosites, not their number or density. If we had irregular pixels (which could be a repeating pattern over a much larger area), the sensors would be able to better match the irregular nature of the subject (which in turn forms an irregular input signal). The pixels could each be the same area to avoid gain problems, and they could be fairly large in order to maintain good noise, color and dynamic range characteristics. Fuji tried to change the sensor with diagonal arrays and small-large arrays, but these were still fundamentally too regular – and resulted in even stranger artefacts, because we were now trying to output an image with an underlying 45 degree structure onto a 90 degree orthogonal output medium. The files printed well, however, because of the dithering process during printing explained previously.
Let’s see an example to make this clearer.
This image (larger here) is originally a 90-megapixel stitch. It has a lot of resolving power; a scene with a large amount of irregular fractal detail requires it to make a convincing print.
Here is a crop of the centre: (click the image for 100%)
What happens if we simulate an irregular-shaped sensor array imposed over the top? I will use Photoshop’s Crystallise filter to do this, which turns the image into blocks of a size of your choosing. Let’s start with an average five pixel diameter:
At web sizes, there is no visible difference. At all but the largest print sizes, there is little difference, too. Yet at the pixel level (again, click the image for 100%):
Clearly, there is less resolving power here. But does it seem as though we are only looking at ~5 megapixels (the tiles are somewhat less than 5×5 pixels in size)? No, not at all. It doesn’t look 25x less detailed. Imagine pixels with five times the area, though: what dynamic range and color properties we could have, assuming the same underlying photosite technology! Let’s move to a ten pixel diameter:
Still no difference at web size. You’d have to halve the size of your print, though. Once again, click the image for 100%:
This has clearly become quite crunchy. But not as crunchy as we would expect: remember, we’re now at approximately 60 times less resolution – about 1.5 megapixels. I’m not saying this is an optimal level, but the reality is not as bad as we would think. What if we go extreme, to a 20-pixel diameter?
Still looks fine at web sizes to me. At 100%, it could be an abstract tile mosaic:
There’s now a whopping ~250 times less information here – for a grand total of 0.36 megapixels. It doesn’t hold up at 100%, but I bet even a fairly large enlargement won’t look that bad – you won’t be seeing this:
…which is what happens if we use a ‘conventional’ regular 20-pixel grid, which looks compromised even at web sizes:
I think two things are clear: not all pixels are created equal, even if there are the same number of them; and if we moved to irregular pixels, the perceived transparency would be a lot higher. We could get the same perceptual image quality we have now with fewer pixels (and lower density) – meaning more light collection per pixel; or we could get even better image quality with the same number of pixels. Of course output would have to be handled differently – probably some sort of upsize-downsize process to display the irregular pixels on a regular grid (since matching irregular input to irregular output would be a colossal artefact-laden nightmare of dividing two irrational numbers). But it’s an interesting thought to ponder while we wait, no? MT
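For readers who want to try something similar outside Photoshop, here’s a rough sketch of the idea: nearest-seed Voronoi averaging with jittered seed points versus plain square binning. This is not Photoshop’s Crystallise algorithm – it’s content-blind – and the filename, cell size and numpy/scipy/imageio dependencies are all placeholders and assumptions:

```python
# Simulate "irregular pixels": average an image over Voronoi cells seeded at
# jittered grid points, and compare against a regular square grid of similar
# cell area. Assumes an H x W x 3 RGB image; 'scene.jpg' is a placeholder.
import numpy as np
from scipy.spatial import cKDTree
from imageio.v2 import imread, imwrite

def crystallize(img, cell=20, jitter=0.45, seed=0):
    """Average img over Voronoi cells of roughly `cell` px spacing with jittered seeds."""
    h, w = img.shape[:2]
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[cell // 2:h:cell, cell // 2:w:cell]
    pts = np.column_stack([ys.ravel(), xs.ravel()]).astype(float)
    pts += rng.uniform(-jitter * cell, jitter * cell, pts.shape)   # break up the regularity
    labels = cKDTree(pts).query(np.indices((h, w)).reshape(2, -1).T)[1]
    out = np.zeros_like(img, dtype=float)
    flat = img.reshape(-1, img.shape[-1]).astype(float)
    for c in range(img.shape[-1]):
        sums = np.bincount(labels, weights=flat[:, c], minlength=len(pts))
        counts = np.bincount(labels, minlength=len(pts))
        out.reshape(-1, img.shape[-1])[:, c] = (sums / np.maximum(counts, 1))[labels]
    return out.astype(img.dtype)

def block_average(img, cell=20):
    """Conventional regular-grid binning for comparison (output cropped to whole cells)."""
    h, w = img.shape[:2]
    h2, w2 = h - h % cell, w - w % cell
    small = img[:h2, :w2].reshape(h2 // cell, cell, w2 // cell, cell, -1).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, cell, axis=0), cell, axis=1).astype(img.dtype)

img = imread("scene.jpg")
imwrite("irregular_20px.jpg", crystallize(img, cell=20))
imwrite("regular_20px.jpg", block_average(img, cell=20))
```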
Great post and example! 🙂 I already knew that noise can improve the perceived sharpness, but I didn’t think altering the pixels’ shape would have a similar effect.
Hello Ming,
Interesting that you bring this up. Images from a digital camera often strike me as boring, particularly compared to the joy I get out of my 4×5″ sheet film view camera images.
Today, after reading this post, I carried out an experiment that has been calling to me from the back of my mind for quite some time. What I did was to scan (at 2040ppi) a fully overexposed (and thus completely ‘white’) 4×5″ Provia color reversal sheet into a digital file. I then reworked the digital image so that the film grain now represents slight brightness variations around the mid-grey mark. Following that, I picked a digital (DSLR) image showing large smooth areas of the same color and little texture (i.e. painted panels on the side of an airplane) and added my scanned and reworked film grain image as an overlay in Photoshop.
The results are quite interesting: smooth areas come alive, are no longer that dull, and make me feel much more at home given my primary use of sheet film. Interestingly enough, even though the grainy overlay technically blurs or spoils fine details to some extent, I actually -experience- them as sharper than without the ‘grain’ variations.
Again thanks for your generous sharing of all kinds of interesting finds,
Cheers,
Herman
No problem. What you’re seeing is not so much the detail but the suggestion of it…to trick our eyes into believing there’s more, and the image is as continuous as the world it’s supposed to represent.
I read your article but frankly have not read all the comments.
I foresee sensors going in a linear progression for a while. The big limit right now is cost to produce. We need to get Medium Format down to a point where a more average person could access it. The new Hassy is my new wet dream that I will likely never be able to afford. That said, it is a huge step in the right direction. The next step is getting to per pixel exposure to maximize data, resolution and dynamic range.
Let’s say you could expose all pixels to within a 1-3 stop range of average grey. Then in processing, the relative differences could be used to recreate the image as seen, with you having control of the relative DR. You could then structure the image as your artistic vision demands. For news organizations and contests, the data could be encrypted in such a manner that the original file could be used to validate the image. The rest of us get the chance to create the image as we see it.
Food for thought…
The cost to produce isn’t going to come down anytime soon until there’s a change in the process itself; the finer the structures, the higher the chance of errors per unit area, and the higher the rejection rate – especially for larger sensors, which are more likely to contain errors simply because there’s more area.
Per pixel exposure will come when there’s enough computing power, and when they figure out a way of putting that circuitry somewhere it won’t block light collection…
Ming (or anyone)
I am a little surprised no one has mentioned matte or glossy screens and their influence.
Clearly it will be completely dependent on monitor quality; however, when I saw a professional matte screen monitor, displayed in studio conditions, it had a much more natural and relaxed look than regular gloss monitors of the same resolution. The images were noticeably more solid (or real) and print-like, though not sharper (nor lacking in detail either). I do not think it was dynamic range or colour calibration, because it worked with graphics, monochrome and highly manipulated photographs. Maybe the matte screen is more like paper, resolving transitions more smoothly (but minutely softer) compared to a glossy screen.
Has anyone who has worked with a professional monitor in controlled ambient light conditions got any general observations regarding perception of images?
Is there any correlation with fractal or regular grids as discussed in the post?
Agreed – the screen dithers a little bit, much the same as matte vs glossy paper. I suppose the non-reflectivity of the screen causes a bit of diffusion at the edges of each pixel – though this probably defeats the point of retina displays. I personally prefer matte screens, though I have to work with glossy as most client output is consumed/viewed/assessed that way – there’s no point in making an image that doesn’t work for your target audience…
“The nature of film is fractal, irregular…” Once again the depth and breadth of my ignorance is revealed. All along I had thought the essence of things which are fractal is their regularity — a form of regularity which makes them infinitely scaleable. Capable of being enlarged to absurd dimensions without loss of form. Oh, well…
Back to the subject at hand: it may be that the problem is not the rigid grid structure of the digital sensor, but what is done with the information after it is obtained. Given an adequate number (whatever that means) of points of information, is it not possible that there are other ways of expressing them — something other than transferring the data to a larger grid? Vector images created in a graphics program can be seen on a monitor, but actually exist as a pile of mathematical instructions. They have to be rasterized into a pixel grid to be worked in Photoshop. How about going in the other direction? Can the data gathered by a grid-like sensor be translated into a vector image, complete with the interpolations needed to result in stepless curves?
No, you’re not wrong. I might have chosen my words poorly: fractals by definition are self-similar at different scales, which means that at a given scale, they have to look mostly irregular. Even then, there is the class of fractals that land up being symmetric (think reflected Mandelbrot spirals etc.) and repetitive, and the class that isn’t repetitive at a given scale (think a rocky mountain). Both are fractal, both are irregular, and both are self-similar at different scales. Regularity and self-similarity aren’t quite the same thing.
I think somebody else proposed the vector grid too – and I think it’s a good idea except that there’s a big leap in pattern recognition that hasn’t been made yet in the form of trying to figure out what is the same object and thus a contiguous vector, vs a separate one behind it, for instance. There’s also the question of how to render tonal gradients as vectors – by definition, I think you can’t do it because you’d either have to heavily quantise the gradient, resulting in posterisation, or have so many vectors we’re back to single pixels/bits of information. Unfortunately, spatial resolution and tonal accuracy/resolution are also inextricably linked: the more resolution, the more tonal steps you can represent, the finer the perceived gradient reproduced.
Whilst ultimately the digital medium still results in stepping quantisation between adjacent areas at the pixel level, the aim is to try and break that perception by breaking up the regularity of the input or output – or both – so that the eye does not perceive the quantisation (and thus breaks in quantisation) first.
Edit: I just reread that and apologise as it comes across as a bit heavy. Hope that made some sense…
It does make sense. My grasp of all this stuff is at best superficial. Question posed, and answered.
By the way, there are a couple of other things I find interesting with regard to the expression of images using pixels, one highly regular and predictable, the other wildly variable.
The first is the use of multiple screens set at angles to one another employed in four-color printing. The composite which results when ink hits the page is not a rectangular grid but a series of multicolor rosettes with little divots of color in between them. Seen at low lines-per-inch rates, as in the coarse screens needed for printing newspapers (remember them?) the pattern is visible. But, in fine printing it’s invisible without significant magnification. The eye overlooks or fills in the blanks completely.
That leads to the second example: the art of painter Chuck Close, known for painting with pixels before digital pixellation even existed. Take a close look at the individual units which make up some of his large, photorealistic paintings, such as that early nine-foot-tall self portrait which almost leaps off the wall when you enter a gallery where it’s on display. The ‘pixels’ are irregular in shape, made up of multiple colors and forms, resembling nothing found in reality. But when put together in proper sequence they add up to an illusion of reality which can jolt the senses.
My apology for wandering so far off-topic. I guess the underlying thought is that it’s all illusion, just a question of how best to get there.
Chuck Close’s work is perhaps closest to what we’re postulating here – if you were to count the number of shapes/dots/units he uses, I think you’ll find it’s far less than the impression given. In reality, this is really a question of data compression…
There is already a very basic form of interpolation being done with the debayering process, which turns the three disparate color channels and pixels into one image: besides the color interpolation, there’s also some effort to recognize lines and draw them as lines. There may be more extensive stuff going on in the proprietary algorithms that Adobe, C1, etc. use. For example, I don’t find the open-source RAW converters to be very good at a lot of things and prefer Adobe’s conversion. Others’ preferences may vary.
This is where things like fractal models may be helpful: since the RAW processing software (whether in the camera or on your desktop) has to make a guess, one tries to use a model of guessing that will best approximate the structures that you might encounter in the real world. Some say that fractal models are good for approximating natural things. Of course, if you have someone shooting very regular and geometric things, like maybe architecture, some other model may be better for that person.
But why do we have to guess in the first place? Isn’t the data all there, and as some posters mention below, there are mathematical frameworks for reconstructing data below a certain threshold exactly? The problem is that camera sensors don’t meet any of the criteria for exact reconstruction. Those with antialiasing (AA) filters try to do it, but they can’t because their AA filters don’t work very well for those mathematical frameworks. The sampling theorem assumes a pretty specific filter that has pretty steep requirements, and imaging sensor AA filters, even the strong ones, don’t even come close to those requirements.
So we have lots of extra information leaking and overlapping each other, and we have to sort it out. This is what drives everything else, even leaving out the Bayer interpolation that has to be done. No RAW conversion is perfect, and they’re all a different set of compromises.
This isn’t a new problem: the computer graphics people have known about this for more than 20 years, and in fact have proposed and implemented something similar to what Ming says about noise above. They use supersampling (many samples per pixel) with sample points laid out in a Poisson-disc distribution. They were inspired by a similar distribution of photoreceptors in the retina, since the eye has no effective AA filter either. (BTW the study was done on rhesus monkey retinae since it’s believed they are similar to human retinae.)
The mathematical consequence of this is that the artifacts that we get from sampling without an AA filter (or a bad one) are turned into noise-like structures that are perceptually less offensive, instead of occurring as highly concentrated structures that we see as jaggies and other artifacts. You’ve probably seen the results if you’ve watched any Pixar movie or CGI done by ILM in the last two decades, as this technique underlies many of their rendering algorithms.
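A toy version of that idea, for anyone who wants to see it happen: the “scene” function below is a synthetic stand-in for fine real-world detail (deliberately above the pixel Nyquist limit), not an optical model, and the jittered sampling only loosely approximates a true Poisson-disc layout:

```python
# Take many irregularly placed (jittered) samples per pixel and average them,
# versus one sample at the pixel centre. Point sampling a pattern above the
# Nyquist limit produces a false low-frequency pattern; jittered supersampling
# turns that aliasing energy into benign, noise-like error.
import numpy as np

def scene(y, x):
    return 0.5 + 0.5 * np.sin(4.1 * x) * np.cos(3.7 * y)  # above Nyquist for a 1 px grid

def render(width, height, samples_per_pixel=16, jitter=True, seed=0):
    rng = np.random.default_rng(seed)
    out = np.zeros((height, width))
    for j in range(height):
        for i in range(width):
            if jitter:
                sy = j + rng.random(samples_per_pixel)      # irregular positions inside
                sx = i + rng.random(samples_per_pixel)      # the pixel footprint
            else:
                sy = np.full(samples_per_pixel, j + 0.5)    # regular: pixel centre only
                sx = np.full(samples_per_pixel, i + 0.5)
            out[j, i] = scene(sy, sx).mean()                # average down = final pixel value
    return out

aliased = render(64, 64, samples_per_pixel=1, jitter=False)  # visible false pattern (moire)
smooth = render(64, 64, samples_per_pixel=16, jitter=True)   # roughly uniform grey plus noise
```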
So how do we use this kind of supersampling in sensors? One way might be to have a very high resolution sensor (hundreds of megapixels), but always downsampling to produce its output.
You’re on to something in the last sentence 🙂 In both cases though, portable processing power has some way to catch up.
I am not sure I can see how irregularly shaped pixels are better at recording reality. They may be better at giving an illusion of more detail due to the fractal nature of the subject, which is fine, but one can get the same effect (illusion, not accurate representation) by adding random noise. (Disclaimer: I had just one semester of signal processing in my undergrad years. That’s all.)
On a related topic, I have the following thought on megapixel sufficiency… an iPhone’s megapixel count may be sufficient for an A4 print and for typical web output on a standard computer display, but it is not sufficient for your 5K 27″ display. I think computer monitors and TV displays are becoming substitutes for big prints for many people, and these electronic display devices will increase in size and resolution. It means a 24MP camera is not going to be enough even for the general population in future.
Hi Ming,
Thanks for that experiment. But I see one problem: you cannot rescale a 90-megapixel picture to simulate a lower resolution sensor. It gives you much better results than you would get in reality – because of the Bayer matrix. So to properly test sensor resolution you have to use a different sensor in the same situation.
Color resolution on a Bayer matrix is, as we know, 1/4 of the pixel count. Have a look at this great article:
https://www.onlandscape.co.uk/2014/12/36-megapixels-vs-6×7-velvia/
But I totally agree with you about fractal analog / digital matrix problem.
Thanks!
You bring up a valid point, though I’m honestly not sure it’ll make a difference as the tiling seems to also have a posterization effect – that throws the color resolution out of the window…
My survival formula has been getting back to film and staying updated on scan technology. Five years from now, film + scan tech will deliver better results than my 7/8-year-old 12MP “state of the art” raw files. With digital capture you will always be caught by obsolescence, no matter how many thousands of dollars you invest. With physical capture and updated scan tech, your old negs will improve and look even better than they do today. My two cents. Great article, right to the point.
I think we’re facing some chemical/physical limitations with film though. Even if scanning tech is better, which seems to be progressing even slower than the rest of digital capture (I’m guessing due to limited market size) – the film can’t record any more information. 35mm still maxes out at about 12MP equivalent. I agree the transition to ‘indistinguishable’ at the limits of resolving power of the medium is much more visually pleasing than the abrupt steps in digital – but that is also arguably a non-issue now that we have anywhere up to four times the spatial resolution in the same recording area…
What about 120 film? What would its equivalent be?
Depends on the size of the negative, of course. I guess a good rule of thumb is somewhere around 1-1.5MP per square cm of fine grained (color slide, B&W low ISO neg) film or thereabouts. This corresponds with my observations of a Hassy 6×6 (really 54x54mm) being about 30-35MP.
The problem I foresee is what happens in the future as technology improves – what happens if you want to re-work a ‘crystallised’ image? Perhaps the best solution is to capture and store the image with ‘regular’ pixels but output it with irregular ones best suited to the image itself?
What about ‘pixel combining’ at capture – the camera analyses the scene and combines pixels where resolution isn’t needed to improve dynamic range etc?
I think the best solution might be a hybrid one: regular pixel capture, upscaling to the largest crystallised form possible for output…
Interesting subject, on what and when enough is enough. Well, it all depends. To my tired eyes, for general comfortable viewing purposes – which in my case oscillates somewhere around a 50mm lens angle of view – the deciding factor would be the limit of human visual acuity, which I’ve read somewhere is, for 20/20 vision, 1 arcminute = 1.75mm at 6m distance. Everything beyond that in terms of resolution is of academic interest. Now, if you want to examine the print from 6cm with a magnifying glass, then the sky is the limit. Do I talk sense, I wonder?
Yes, and I went and did the math in the past; it comes to something like 1050ppi in an ideal world (and slightly wider than 50mm, if I remember correctly). I guess it would be around the 500-600ppi mark for that angle of view. In practice, it’s somewhat less because the output media at that resolution are usually almost continuous. If digital, we need a little bit more (i.e. effectively going beyond the resolving circle of confusion of the human vision system). What I do know is that we printed the same scene at 240, 360, 500 and 720 PPI: there’s a clear difference between the first three, but the last two are heavily dependent on the scene and on how good your vision is – I could see the difference between 500 and 720, but most older people could not. Differences were clear with a loupe, of course.
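For anyone who wants to run their own numbers, here’s a back-of-envelope version of that calculation. The 1 arcminute acuity figure and the two-samples-per-resolvable-cycle factor are the assumptions doing all the work; change them and you can land anywhere from a few hundred to over 1000ppi, which is why so many different figures circulate:

```python
# Required print resolution for a given visual acuity and viewing distance.
# Assumptions: 1 arcminute acuity (20/20 vision), 2 samples per finest
# resolvable line pair. Not a definitive figure -- a rough calculator.
import math

def required_ppi(viewing_distance_cm, acuity_arcmin=1.0, samples_per_cycle=2):
    feature_cm = viewing_distance_cm * math.tan(math.radians(acuity_arcmin / 60.0))
    return samples_per_cycle * 2.54 / feature_cm   # 2.54 cm per inch

for d_cm in (25, 50, 100):                         # close inspection to arm's length
    print(d_cm, "cm ->", round(required_ppi(d_cm)), "ppi")
```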
Hi Ming.
What do you think about a return to the 3 sensor design of ‘old’ pro-camcorders? Whether three separate high res. sensors (1xRED, 1xBLUE and 1xGREEN) could be aligned accurately enough via prism with current technology and at reasonable size/cost, remains to be seen, but it would solve the artifacting problems caused by a scene which contains moving elements.
Lloyd Chambers’ work with the Pentax K1 has shown just how damaging the demosaicing process is to image quality…
Might be interesting – the closest thing we’re going to get is probably the Foveon derivatives. Precise alignment of the three sensors is going to be a major headache especially at decent resolutions, though.
The problem with pixel shifting or multishot modes is that the real world moves…
Very interesting reflection! Even though it might be outside the scope of this article, I’m very interested in semi-random patterns that fill a surface. Although I’m not an expert on the sensor fabrication process, I imagine that it would be very difficult to implement QC on a chip with a random pattern. With that technical hurdle out of the way, a corollary to this idea would be to have every sensor with a different pattern. It’s as close as you can get to film.
I think so too. Doing it once might be hard but repeatable enough; different patterns for each sensor would be impossible because the underlying circuitry has to be laid out, too – and this can’t be quite as random. At this point, we may simply not have the technology yet…
Ming – As always you make me think. I’ve thought about this post, as well as Das Wimmelbild, and the Output Disconnect and the Future of Image Viewing, and it raised this question: “If I’m not likely to print many of my photos, and want to see/work with them at the highest viewing resolution possible (in my case an iPad Pro 12.9″, or I was thinking of adding an Apple 27″ Retina iMac since I don’t need the portability that you do), at what point does my output become the limiting factor?”
In other words, right now, following your earlier suggestions, my camera gear consists of a Nikon D5500 with the kit lens and a 55-200 Nikon VRII telephoto. I wonder to what end new gear would add anything significant to my output? If I can’t see the difference on a video monitor or iPad Pro, and I’m not likely to print many images at large sizes, is there still a benefit to having massively more resolving power or resolution than my current gear will create if I can’t view it at the pixel level? I believe the iMac 27″ Retina has a 15.5MM pixel display. What happens to all the “extra” pixels my sensor captures? How do they get converted to fit the display’s smaller size? What about the 50MP+ Canon and even larger MF images?
As a related question, I’m also unclear about the sensor format to purchase. I noticed that your Ultraprints are “modest” in size – typically not the 20×24″ poster size, but something more intimate to allow the viewer to capture the entire image easily.
How should I think of these issues, and if forced to choose, where should the compromise be? Do I keep the DX size sensor, or is there a benefit to FF or MF that I will actually be able to experience? Do I upgrade viewing options by investing in an iMac 27″ Retina?
Thanks for making me think things through – it was very worthwhile to see outside my comfort zone. It seems to me the best investment is studying more of your video tutorials, especially learning the technical nature of Photoshop and most importantly how to “see” things as a photographer, not a snapshot taker. My ultimate goal is to learn how to create and apply the Das Wimmelbild perspective to my photography, and to present those images in the cinematic style that you have developed so beautifully.
By the way I’m thoroughly enjoying your Making Outstanding Images series!
Depends on your input. I have the 5K 27″ iMac, and it still only shows about a quarter of what the 50MP cameras capture. Taking Bayer interpolation into account, I think it shows the equivalent of a 24MP ‘normal’ sensor’s worth of information – so anything more than that won’t be visible, for the time being. Remember that each output pixel on the display has separate RGB values – with most digital sensors, the RGB values at one spatial location are interpolated from the neighbours. Each physical location only records R, G or B – this is the Bayer matrix. In short: what you’ve got now is quite well matched in terms of input-output.
Ultraprints: those are significantly higher in information density, which means that I’m input resolution limited, not output limited – theoretically I could do 44″ width (with the current printer, potentially up to 60″ with a different one) x the length of the paper – 100 feet.
In every case, there’s diminishing returns at work. So whilst a hypothetical 27″ 8K display and medium format will look better than 24MP APSC and 5K, the difference might not be that obvious or present under every condition – and small things like camera shake or bad exposure and subsequent recovery in PP may well land up negating the difference entirely.
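Rough numbers behind the display-matching point above, for reference – the factor of roughly 2 for converting Bayer photosites into “full RGB” display pixels is a loose rule of thumb, not a precise figure:

```python
# Back-of-envelope comparison of a 5K display against a 50MP Bayer capture.
display_px = 5120 * 2880        # 5K panel: ~14.7 million full-RGB pixels
sensor_px = 50_000_000          # a 50MP Bayer sensor
print(display_px / sensor_px)   # ~0.29 -> the display shows roughly a quarter at 1:1
print(display_px * 2 / 1e6)     # ~29.5 -> of the order of a 24-30MP Bayer capture to feed it
```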
Thank you very much for your reply, and your gracious manner. I’m nearly finished with the “first round” of the Making Outstanding Images series, and I plan to follow that with added classes. Any suggestions are welcome! It’s been many years since I’ve dedicated time to photography – you’re a big reason why I’ve returned. Seeing your images, watching your tutorials, attending the Gear webinar, all combined to rekindle my interest of decades past. I still like holding the old Hasselblad negatives – there’s something magical about them.
No problem, and thank you! Post-Outstanding Images, I’d try How To See, and of course the post processing stuff – once you decide which style you prefer.
Len, a worthy line of questioning! Whether to “upgrade” or not depends on a few things:
– whether you are going to print and seek the best quality you can possibly get, as Ming does;
– what you shoot. Most cameras are more than adequate with enough light. However, if light is out of the photographer’s hands, then some formats and sensors help maintain sound file quality;
– similar to the above point, but specifically whether the DR of the camera allows you to achieve what you set out to do with quality.
I own a D750. I think it’s a great camera at the price. However, I do quite a bit of street and travel photography and purchased the X-Pro2 a little while ago. FF vs APS-C camera, both 24MP. After trialling it for a few sessions I took the X-Pro2 back. To me it struggled to retain image quality in the out-of-focus darker areas. Street photography can mean working with fast-changing, variable light, so for me it was not the right tool. No doubt my technical inadequacies may have played a part.
On a recent trip to China I took the D750 and one lens, the 24-120 f4 VR. With a “slower” lens, the D750 handled those same types of situations much better than the X-Pro2. The FF sensor vs the APS-C variety helped in the DR stakes and ISO performance, and subsequently allowed for some high quality prints.
Obviously this is not in same league as Ultra Prints and Hasselblad capability though hopefully something to think about.
For what it’s worth, the 24-120/4 VR is also one of my workhorses for the Nikon system (review here). By f8 and with care in focusing and stability, it’s pretty good.
I’m afraid there’s one big fault in the example: Photoshop chooses optimal mosaic tiles given the source image, not arbitrary ones (I didn’t check, but it’s quite obvious looking at the pictures). Obviously a camera sensor cannot do that, and random tiles wouldn’t look any better than a regular grid (which IS random from a real-world subject’s point of view) – possibly even worse, cue demosaicing issues. The only real solution is to oversample enough in capture (i.e. add more pixels) and perhaps apply some clever technique in output (a regular grid may not be optimal for a digital display when moving beyond perception limits, but I’m not sure it really matters – adding noise may also work for high-res displays). Increased resolution also improves colour accuracy and tonality, as you often point out; there are many good reasons Lloyd Chambers promotes oversampling beyond the lens’s maximum sharpness.
“.. regular grid (which IS random ..”
Not from the way our eyes work: as far as I can see, they make lines (like a rectangular grid) more important than speckles (like e.g. the leaves in these images).
True, but you get a different result every time you run the filter – and it looks equally passable. So I suspect that the tile configuration might not actually matter too much …
The difference is that you’d get the same amount of contrast as the regular grid version. Just look at the difference in the above examples, it’s obvious why the mosaic looks so much better. The same would go for colour variation, but the example is mostly green.
Also, if you oversample the input (to retain those small peak highlights as far as the lens can resolve), but have large random shaped output “pixels”, you’ll run into the same issue (averaging the data over an area). The only potential benefit would be losing the perceptually strange regular grid, as you do on print. However, hard edged pixels of random shape would probably introduce weird artifacts, so you’ll need an entirely new display technology, and you’ll also need all that detailed input processing for a much less detailed output. Therefore increasing both input and output resolution is the primary solution.
You’re right, of course. Practically introducing a mosaic would be nigh on impossible; increasing resolution beyond our ability to discern (I guess the equivalent to the eye’s Nyquist limit) is the only way to make a universally scalable input to output – this was more of a thought experiment in perception…
It’s still useful to think about organic/fractal type subjects and the related workflow bottlenecks. I feel that those are the only cases where I might be limited by gear AT ALL, though photographer skill and patience are definitely the bigger hurdles in my case.
By the way, what lens qualities do you find most important when shooting these kinds of subjects? For some reason I prefer the ‘look’ of typical macro lenses, even at longer distances, but I’m not able to quantify it. I don’t own anything close to an Otus, of course…
Until recently, dynamic range has still been a problem, I think – this is regardless of subject matter. Less so now. Remember that more resolution does mean more tonal subtlety since finer spatial transitions can be represented, too – this probably plays a big part in the medium format ‘look’, too.
Lens qualities: macro lenses tend to be quite planar and render fine micro contrast with flat field. I like the Otuses for resolving power and apochromaticness – which helps with separation – but I don’t like the slightly hard bokeh caused by the aspherical elements. In these situations my choice is the 2/135 APO, that as far as I know has no aspherical elements, or the 85 Otus which only renders harsh bokeh for very OOF point lights (which don’t happen often in nature). The 2.8/85 Contax Yashica Zeiss MMG is also excellent, as is the 2.8/35 PC Distagon.
Ya, I suspect that macro lenses are better corrected for chromatic aberrations, at the expense of lens speed (to limit size and cost). It seems that in-focus areas are better separated from the rest while maintaining decent looking bokeh. Zeiss 2/135 produces awesome results indeed, sadly Milvus 1.4/50 less so. Well, never mind, I still have work to do with Nikkors 🙂
In that FL range: try the 60/2.8 AFS Micro-Nikkor. I think you might be surprised… 🙂
My favourite lens already 🙂
This. The natural follow-up would be to determine exactly how the Crystallize filter works. My suspicion is that, even if there are random components, it ultimately takes cues from the image content, to preserve contrast and edges in the output of the filter. Alternatively, you could code up your own version of the filter that you ensure takes no cues from the image (it looks like the filter is based on Voronoi diagrams / Delaunay triangulations, for which off-the-shelf libraries exist in Python) and see how it compares to the Crystallize filter.
Interesting how *much* more detail the “20-px random” image “gives” compared to the “20-px rectangular” image.
Or is it an illusion of detail due to the more fractal (=natural) representation?
Or is it the rectangular grid that makes us believe in lesser detail?
Very interesting.
My hypothesis: our saccadic eye movements, which detect lines, make the rectangular grid more prominent and distract from picture details (and the brain’s limited information intake speed helps) – in contrast to a fractal impression of more detail.
( What if both images get slowly blurred?
At what blur level would the difference disappear?)
BTW. This seems a very good way to do oversampling!
Photoshop’s “crystallize” filter throws away low-amplitude high frequency information, simply pixelating to a grid on the same scale throws away high frequency information without regard to amplitude. The reasons the “random” image gives more apparent detail may be several, but at least one of them is that it *has* a lot more detail.
Aha.
Thanks!
( Can the “crystallize” filter (or some other amplitude sensitive software) be set to produce square “crystals” in a grid
and so make a comparison that isolates the effect of a pixel grid?
Just to understand what goes on.)
I don’t know! I do think that “square” is not quite what you’d want here, but to do a fair comparison you do need a *fixed* arrangement of sensels which you apply to many pictures. Crystallize can be viewed as custom-generating a “randomly placed sensel arrangement” on a picture by picture basis. See Tarmo’s comment.
Well,
I just thought it might be interesting to try to see how much of the difference in Ming’s comparison is due to a difference in information content, and how much is due to the difference random pixels and a pixel grid make to our _perception_ of detail.
Because I am pretty sure Ming is right that the square grid is worse on our eyes.
Do it several times. PS applies a different grid, each time, which *perceptually* looks no different from the previous one when you view the entire image. I just tried, with an overlay (that interestingly averages out to Gaussian blur if you iterate enough times).
That was the point I was trying to make: for low frequency images without detail there’s little visible difference. The mosaic preserves the apparent visual impression without the attendant requirement of massive data sizes.
Many years ago I read about a then new compression algorithm using fractal principles.
It was said to preserve the appearance of image or sound at larger compressions than other algorithms.
( If I remember rightly, it took quite a bit more processing power to compress but could restore during viewing or listening.)
It’s an illusion, I think – but it again depends very much on your eyesight and output device.
Really interesting idea Ming. I think trying to better represent the curve, as you put it, will indefinitely be ammunition for sales and marketing. I always felt film (or grain, I guess) lent a texture to the image, and in large prints it was almost like seeing the molecules that made up the world, which lent a metaphysical property to a photograph. Of course you can add digital grain now, but I can never really bring myself to intentionally degrade an image. When it comes to representing the real world there are plenty of stages of irregularity before we get to the quantum world, and even then things are by definition not predictable, and therefore a uniform representation may actually be a less accurate one. Perhaps there’s some innate human instinct going on in never being satisfied with digital… neither a totally analogue nor a totally digital medium is best, but one that encompasses both, as your idea may well provide. Perhaps the make-up of the photosites could also be malleable, in that they could float in some way, giving you a unique pattern for each image.
Interesting point with regard to quantum-level deconstruction: yes, basically everything has fuzzy, irregular, uncertain edges. The cells in our retinas are also quite irregular…there may be something in this.
“Perhaps there’s some innate human instinct going on in never being satisfied ..”
What we see is (as far as I have been told) what the brain recombines from the input of our eye movements of different frequencies and short-term memories of what we just saw (plus what memory tells us we ought to see).
( The high frequency eye movements sense edges, contours and lines.)
And I believe our brain gives a *continuous* (in space and time) representation to our “mind”.
( Or is the continuity what the ” mind” produces? No matter.)
That makes sense, and is how we perceive far more information than the raw photoreceptor count on our retina would suggest. In essence, the brain is doing both the over sampling and the non-integer sampling for us – and then averaging the result further for continuity. There are no discrete steps in vision. However, since every other reproduction system very much has to be a linear input-output relationship, we may never be able to replicate this.
Ming,
(just had a thought)
we have mentioned before feeling at home in a landscape we have roots in.
Is this (part of) it, that when the memory can supply a much greater part of our vision, then the brain has less work to build it, and we can feel more relaxed as there is more brain capacity left over to take in new things?
And this might be part of how we react to different kinds of Wimmelbilder. ?.
And perhaps the less work the brain has to do to subtract artifacts like e.g. remnants of pixel grids, the more relaxed we can enjoy/study the image (quite apart from immersiveness).
Edit:
Or maybe that is (a part of) what immersiveness is?
( Long live the *art* of photography!
But it is interesting to speculate on how it, and the tools of it, “works”. )
Understanding one’s tools is important to mastering the art…
🙂
Hmm…interesting. If I understand you right, the imagination or memory makes up the difference?
Memory, yes.
Imagination, I’ll have to think about that, but why not.
Other parallels, perhaps :
When you listen to (complex) new and unfamiliar music, it takes time to learn a new music-language before you really can hear what the music is about.
I guess that this learning in familiarising us with this new world by building new association networks in the brain also frees up capacity to really listen.
When a foreigner speaks your language, I find that mistakes in prosody make listening and understanding much more difficult than mistakes in pronunciation. I figure that structural differences need more brain work than differences in details.
Definitely. Even your own mother tongue spoken with a different accent can be tough to deal with.
Well, you know my position, Ming: I am a happy Leica S007 owner. Still, I guess Leica will feel the pressure from the increase in mpix from 50 almost as a baseline, up to (currently) 100 for the cutting edge. There are so many parameters to this discussion, including the design of the sensor itself. If you allow links here, I think there is an interesting discussion in the last part of this article on sensor design: http://www.reddotforum.com/content/2014/11/why-leica-is-staying-at-37-5mp-for-the-s-typ-007/
I print 50*70 cm from my S007, I compare it to a friend’s P1, and I have nothing to complain about.
I think people also sometimes forget that the sensor sits in a CAMERA that should work with a set of lenses, good autofocus, finder, general responsiveness, weather sealing, ergonomics etc. etc. And, those who step up from DSLRs will find how MF cameras are much more sensitive to camera shake – mpix won’t help you much if you can’t hold it still.
But it is certainly interesting times, each product release upping the bar, lowering the cost, and giving more choice.
Indeed. We’re way past enough, but not all ‘sensing photo sites’ are created equal – Sigma has proven the definition of ‘a pixel’ is somewhat fluid, and smaller sensors certainly have less pixel level integrity than larger ones (all other things being equal). Will the 100MP be hugely better than 50MP? I have no idea, to be honest – hand shake and deployability is one consideration, as are smaller photo sites, but then there’s a generational improvement in technology to counter it. What I *do* know is that MF color and dynamic range – not so much resolution – were the big gains for me over 35mm.
You are simply brilliant!
…as a person trained in psychology…perception has always been the key…as a humble photographer I have always been fascinated as to why my lower res photos sometimes seemed more alive…there is a parallel with digital music…
best you, giom
Thanks!
This was interesting. One could imagine a not too distant future technology where the pixels in camera sensors and computer screens are grown, more or less organically, merging advanced biotechnology with electronics. This would give us pixels that are roughly the same size, but all of them would be slightly irregular, and they would not be arranged in a perfectly square grid. This could be sort of similar to film grain, giving us the effect you are describing here. Or maybe we would have endless debates about Adobe’s latest software not handling the organic pixels correctly 🙂
That may well work…or, yes, I’m expecting somebody to say something about X-trans very soon 🙂
That photography is not about megapixels became absolutely evident to me when I bought Taschen’s album containing pictures from Alfred Stieglitz’s Camera Work magazine. These pictures are anything but very detailed, due to the limitations of photographic technology in Stieglitz’s time, yet their artistic merit is beyond doubt.
You do not need to get very technical in explanations, I think, to convince people that resolution, while a very interesting and attractive option, is not the main factor of artistic value. If you look into this book, it should be immediately evident that composition should be our number one priority, followed by the choice of subject. First I wanted to say composition only, but I realized that that is not enough. This is, I think, the reason that photographers who churn out series of great compositions still remain hungry at heart in the end. Perfect compositions of lifeless objects are like arabesques on the walls of Muslim shrines: beautiful, very interesting, and you appreciate their complexity, but for us human beings something else is needed to really touch the soul. This is where the choice of subject comes in.
Agreed. In the end, photography isn’t about visuals, or capture, or even aesthetics. It’s about showing somebody else your imagination. The rest are tools…sometimes they matter, and sometimes they don’t. What this post really does is identify the divergence – and try to help the reader figure out which side of the art-technician divide they sit on, and be comfortable with it. 🙂
Very impressive difference.
Irregular shape, irregular size and further increase in resolution = future of image capturing.
Ideally the processor would “remember” the size, shape and placement of each element and process accordingly. I can’t imagine how much computing power that would require…
For now I haven’t seen anything that would match film transparency on a light table.
I’m actually not sure it matters if the processor remembers or not – I think it’s the illusion of continuity provided by irregularity that does the trick. At some point, it’s no longer pure mathematics and more a question of physiology, psychology and perception.
I think you are right about that, Ming!
An irregularity introduced at the beginning ought to be more effective and give a more stable result than noise added later.
But even then there will still be _some_ pixel race; in the ’60s we had the same discussion about small vs large grain.
( As for technology, I guess it’s just a matter of time until the quantum dots used in the newest TVs are “reversed” and used in sensors – and why shouldn’t they be scattered a little bit? When readout electronics allow it, we might get a more random distribution of them.)
– – –
[ Remember the discussion of “digital sound” as CD players emerged?
I think the reason was the low quality of the converters, which destroyed the illusion of continuity in the sound by adding distortions.
Except in some more expensive players, which sounded much more natural. Cf. the ‘blad sensor?]
Interesting analogy. Perhaps it’s the randomness at the very lowest amplitudes/ highest frequencies that preserves the illusion of continuity – both in visual tones and sound…much as the real world does the same right up to the limits of our own sensory abilities.
Sounds right to me !
(as an amateur)
( Perhaps also? : Cf. the illusion of extra detail mentioned in some discussions on sensors without blurring filters.)
Since I decided to abandon my beloved analogue cams & jump head first into the exploration of the wonderful “new” world of digital photography, Ming, I have been fascinated by this topic. The rants & screams & opinions & “information” flying around the net on this subject are about as coherent as the flight of atoms in the cosmos. Every now & then I come across an article – such as yours – which strives to point to a truth or a solution or a “way forward” – and out come the battering rams.
Having no technical skill whatsoever in the field of “pixels” and “sensors”, I am left to depend on people like yourself for “further & better particulars”.
At the moment I’ve reached a stage where I simply find the whole thing vaguely amusing, but really, very unhelpful. So I resort to taking photos and seeing whether I can improve my photography, while ignoring the “noise” coming from the near-constant chatter about pixels.
Along the way I’ve come across comments from pros, supporting a pixel count of 16MP on an FF – on the grounds they claim to get “better tone”, “more information within each pixel”, “better contrast”, or whatever – and that “more” pixels would in their view reduce the quality of their photographs.
I’ve also come across suggestions that “more” pixels = “smaller” pixels, which eventually no longer contain all the information that was recorded on them and which start to leak into adjacent pixels. Sounds reasonable – what do the real “technical experts” have to say on that one?
And claims by some parties that they now have a “better” sensor which can record and retain more information within a given pixel.
There are people out there with high level technical skills in this (not photographers, in other words – but PhD type brainy guys) who can probably address and answer all of these questions and issues. I’m sure there are, but so far I’ve only found one such article.
Then you come along with concrete evidence. And still the shouts don’t die away.
On top of all of this is the other issue you raise – printers. Not all photographs these days are printed – God forbid, as far as I’m concerned – being pursued by selfie sticks is quite sufficient punishment, without being subjected to a paper flood of their output !!!! However, when you look at printers, for most people those gigantic pixel numbers get crunched, and subjected to the limitations of the dot size in the inkjet printers most of us use. And sure, those dots leak around the paper.
I draw comfort from that process, as a matter of fact.
Suddenly, all those little squares or rectangles cease to exist and I am looking at a photo which more closely resembles the photographs I have been producing over the past 50 years. Suddenly, the sun is shining once again – the world looks good – and I can enjoy turning my attention back to simply improving my photography, and stop worrying about all this technical chatter.
At this point I should close, and insert an emoticon – but once again, I lack the skill to shove in a suitable one, so you’ll all have to guess which one I have in mind.
Execution and vision are independent up to a point. And then after that, the diminishing returns curve kicks in: one has to be sympathetic to the other to perfect the last details of an idea, much the same as McDonalds is food as is three star Michelin. You’ll feel full after both, there are no subtleties to understand in one, and you may well feel the latter isn’t even food at times. But it’s food, and the right thing at the right time satisfies. And so the same with photography. There is never a perfect fit, much the same as there is never completeness. Perhaps the remaining ambiguity is half the joy since we’re left free to let our own imagination satisfy us.
The upshot of all this is you can choose to be in the camp that’s always chasing more, or the camp that understands the capabilities/limits of ‘now’, and just makes pictures. I’m all for extracting more if possible – and if that translates to the audience – but the images should always come first.
Unfortunately the technical details in this are largely incorrect. The salient key words are probably “reconstruction filter” and “dithering”, which deal specifically with these issues. Sampling theory makes it clear that regularly spaced sampling sites are in fact just fine, as long as you manage the output process correctly. The “discrete levels connected by straight lines” picture is a myth and always has been. A very widely believed myth, to be sure. The math is a little dense, but manageable to anyone with a firm hold on the first couple of years of calculus and linear algebra.
A correctly handled digital signal can be, and is, rendered in the final output identical to the “smooth” analog input signal, with these differences:
- high frequency content above the Nyquist limit is eliminated
– a quantity of noise with amplitude equal to, I think, 1/2 of the least significant bit, is added
There are ways to push that quantization noise around to bands where it is less offensive, and with modern digital cameras the quantization noise is negligible compared to other sources of noise anyway. In particular, the output is, if anything, “smoother” than the input, as it may lack certain high frequency information.
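For reference, the textbook figures behind those two points, assuming the usual model of quantization error as uniformly distributed over one step $\Delta$ (i.e. one least significant bit):

$$
|e| \le \frac{\Delta}{2}, \qquad e_{\mathrm{RMS}} = \frac{\Delta}{\sqrt{12}} \approx 0.29\,\Delta
$$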
Lyons is the canonical accessible resource, Oppenheim & Schafer is the bible.
There is an extent to which the above discussion is true when looking at pictures on a monitor, or printed on a very bad printer. These constitute “improperly handled signals”, but in the end you typically wind up perceiving things more or less properly anyway.
From a pure mathematical sense, you’re right but only with the two caveats you just described. The problem is that’s never been the point of photography – as you and others have always been quick to point out when I get overly analytical. On top of that, those conditions are almost never met in practical application.
Firstly, the sampling frequency isn’t always over the Nyquist limit, and even if the recording system is capable of it, that doesn’t mean the operator is – a slight bit of hand shake can greatly reduce effective resolving power, i.e. sampling frequency.
Secondly, the Nyquist frequency of the recording system is independent of the Nyquist frequency of the source signal. When that source signal is infinitely detailed to begin with, the limitation is always going to be in the recording system – and later in output.
Thirdly, think for a minute about how most images are viewed. Almost every signal lands up being incorrectly handled in some way at every stage of the process, which results in a tangible loss by the time we hit output. There is always data loss in conversion because there are no perfect practical systems. In practical application, all signals will be improperly handled, which means there has to be something to compensate for this if we’re going to escape endless increases in density without still feeling a tangible disconnect between the presented image and plausibility.
Finally, you’re forgetting the last input medium: the human eye. It becomes no longer pure mathematics and more a question of physiology, psychology and perception.
I could write a long and very boring discussion, and if you want me to I’ll take a crack at it tomorrow, but for now I am just not seeing how randomizing the sensel positions helps with any of those issues. It might, I suppose, create a slight uptick in perceived high frequency information, but at enormous expense, and I think it’s phase dependent anyways which makes it wildly unreliable in actual usage. It’s useless if turning the camera 0.25 degrees to the left destroys the effect completely.
Thank you, but not necessary. The point was and always has been one of perception. Simply: you consume an image with your eyes. I leave the audience to judge for themselves.
Can I just chime in and say, “thanks.” These are my favorite types of exchanges to read. Bravo.
Pleasure!
Hi Ming, as Andrew hinted at, the Shannon sampling and reconstruction theorem is worth revisiting; it explains how to digitize continuous signals and reconstruct them perfectly. The problem in photography is that the theoretical Nyquist sampling rate for a typical lens is beyond current 35mm sensors, which is where the imperfect, dumbed-down anti-alias filters come in. Still, lenses do not provide infinite detail and do have a frequency limit, so proper sampling is possible, but – quick calculation – for an f/1.4 lens you’d need more than 5 gigapixels, and more for color. Of course, regular camera lenses are not up to this kind of performance, but some of them, in the center, can be astonishingly good, burn through the anti-alias filter and make trouble. Film handles this much more gracefully: with its randomly positioned and sized particles and high noise levels, it hides aliasing artefacts. Jagged lines will be randomly jagged, visually less disturbing, even nice… Well, my 2 cts!
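A rough version of that quick calculation, for the curious (assuming a diffraction-limited lens and green light at roughly 550 nm; real lenses, colour filter arrays and practical AA filters all change the numbers):

$$
f_c = \frac{1}{\lambda N} \approx \frac{1}{0.00055\ \text{mm} \times 1.4} \approx 1300\ \text{cycles/mm}
$$

Nyquist needs two samples per cycle, so a pitch of about $1/(2 f_c) \approx 0.38\,\mu\text{m}$, and over a 36 × 24 mm frame:

$$
\frac{36 \times 24\ \text{mm}^2}{(0.38\,\mu\text{m})^2} \approx 6 \times 10^{9}\ \text{photosites}
$$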
I think we’re saying the same thing. It hasn’t got anything to do with mathematical resampling; it’s about visual discontinuity and the limitations of the medium – and since it’s a photograph, what looks more pleasing to the end viewer.
Technologies such as the pixel-shift mode of Pentax K1 and the Super-Fine Detail of the soon-to-be-released SD Quattro is another way to get superior image quality with the same resolution.
“A new “Super-Fine Detail” mode brings out the full potential of the Quattro sensor by capturing seven different exposures with one shot and merging them for exceptional dynamic range (images can be extracted individually as well).”
We’ve had that option for a while already; slight movement of the camera between shots and manual blending/averaging in PS has a very similar result. One fundamental problem still remains for all such techniques: there’s no way to make this work with a scene which contains any moving elements. 🙂
Agree that these technologies have limited applicability, but nevertheless useful in some situations. Another approach is to stitch multiple images into one big image and then down sample.
Two words: Awe. Some! Very thought provoking. Thank you!
Thanks!
I think resolution is just one aspect. Colors are another. If you compare the Canon colors with the Fuji colors, they are exaggerated, but I like the effect. Panasonic has less punchy colors, which becomes boring after Fuji. Olympus, Fuji and Canon have great skin tones. Nikon is true to life…they say. Why all the difference? Lens sharpness is another aspect. And so on. It is not just resolution. Flash fill-in, bounced vs straight on. And then post processing, which can take hours. To what end?
Tonality is not independent of resolution: the more spatial resolution you have, the more accurately you can represent color or luminance transitions.