Last weekend, I wrote an article (Phone Height over Time) on phone heights and how they have changed over time. This week, I will focus on another major metric for telecommunication user devices: resolution. Along the way, I will highlight the challenges one may face in getting the desired result out of any data.

About Source Data

As in the previous week, we will be using data collected from GSMArena up to July 2017. The total device count at the start is 8,653. After web scraping, the dataset looks like this:



8,653 devices in the raw data

Cleansing the Uncleansed

All looks good and shiny in the Resolution column at first glance, and it seems easy to extract the required info. Let’s dig a bit deeper, shall we?


Not too shiny data

The data is not so “pixel-y” and shiny anymore. Older phones did not define resolution in pixels; instead, they used lines or characters to describe a display. This analysis cares only about pixels, so those line and character displays must face the cleansing.

After cleaning and keeping only pixel-related info, our dataset looks like the one below. Note that I extracted the pixel dimensions into separate columns for easier data manipulation later.



Looks good, except for the “NA”s introduced. After another round of cleaning and discarding invalid results (mostly NA), our analysis dataset looks like this:


Final dataset of 8,314 devices that contain the relevant info

We ended up with 8,314 devices that have the info we need. This also highlights the importance of taming (cleansing) data found in the wild.
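For readers who want to try the cleansing step themselves, here is a minimal Python sketch of how the pixel dimensions could be pulled out and the non-pixel entries discarded. The raw string format ("1080 x 1920 pixels ...") and the names `PIXEL_PATTERN` and `parse_resolution` are assumptions made for illustration; this is not the exact code behind the screenshots above.

```python
import re

# Assumed raw format: "<width> x <height> pixels", optionally followed by extra text.
# Line- or character-based displays ("5 lines", "4 x 20 chars") will not match.
PIXEL_PATTERN = re.compile(r"(\d+)\s*x\s*(\d+)\s*pixels", re.IGNORECASE)


def parse_resolution(raw):
    """Return (width, height) in pixels, or None when the text is not pixel-based."""
    if not raw:
        return None
    match = PIXEL_PATTERN.search(raw)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical sample values, only to show the behaviour.
samples = [
    "1080 x 1920 pixels, 16:9 ratio",
    "5 lines",
    "4 x 20 chars",
    None,
]

parsed = [parse_resolution(s) for s in samples]

# Second round of cleaning: drop the entries that could not be parsed (the "NA"s).
clean = [p for p in parsed if p is not None]
```

Keeping width and height as two separate integers mirrors the separate columns mentioned earlier and makes the later megapixel calculation a simple multiplication.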

The Dilemma of Overnumbered Categories

So, we have our desired dataset and are ready to plot resolution changes over the years. But another challenge appeared: because phone resolutions vary so widely, we ended up with 161 different resolution categories.


Categories overload: 161 different resolution types

Plotting all of them would certainly give more info, but the visualizations would be clumsy and overcrowded. So, I decided to divide the resolutions into 6 broad categories, with a little help from this Wikipedia page.


Category        Lower limit (megapixels)    Upper limit (megapixels)
1. Below nHD    0                           0.22
2. nHD          0.23                        0.51
3. qHD          0.52                        0.91
4. HD           0.92                        2.06
5. FHD          2.07                        3.68
6. Quad HD      3.69                        3.69+

So, a phone with a resolution of 1200x1920 (2.3 megapixels) is lumped into the FHD category because it has more pixels than FHD (1080x1920, or 2.07 megapixels) but fewer than Quad HD (1440x2560, or 3.69 megapixels).
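The lumping rule can be sketched in a few lines of Python. This is only an illustration under two assumptions of mine: megapixels are rounded to two decimals before comparison, and only the lower limits from the table are used, so a value that falls between one category's upper limit and the next lower limit stays in the lower category. The name `resolution_category` is hypothetical.

```python
from bisect import bisect_right

# Lower limits in megapixels, copied from the table above.
LOWER_LIMITS = [0.0, 0.23, 0.52, 0.92, 2.07, 3.69]
CATEGORIES = ["Below nHD", "nHD", "qHD", "HD", "FHD", "Quad HD"]


def resolution_category(width, height):
    """Map pixel dimensions to one of the six broad categories."""
    # Assumption: round to two decimals, matching the precision of the table.
    megapixels = round(width * height / 1_000_000, 2)
    # Find the last lower limit that is <= megapixels.
    index = bisect_right(LOWER_LIMITS, megapixels) - 1
    return CATEGORIES[index]


print(resolution_category(1200, 1920))  # the 2.3-megapixel example from the text
```

With this rule, the 1200x1920 phone from the example lands in FHD, while 1440x2560 is the first resolution to reach Quad HD.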

At last…Visualization

With all challenges solved, our dataset looks like this-



Let’s first plot a stacked chart of the number of devices in each category over the last 10 years. The number of higher-resolution devices is clearly increasing year by year, likely due to the drastic increase in network speed and the availability of higher-resolution media content.

Another interesting insight is that the number of different device types released has been decreasing for the last two years. 2015 saw a 15% decrease in device types compared to 2014, and 2016 saw a 26% decrease compared to 2015. With half of 2017 gone, a similar decrease of ~25% over last year is expected. This indicates a sharp shift among manufacturers from a traditional quantity-driven (flooding) release schedule to a limited but quality-focused one.


2015 saw a 15% decrease in device types compared to 2014, and 2016 saw a 26% decrease compared to 2015. With half of 2017 gone, a similar decrease of ~25% over 2016 is expected.
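The year-over-year figures are simple percentage changes between consecutive yearly device counts. Here is a minimal sketch of that arithmetic. Note that the counts below are placeholders I picked so that the output matches the 15% and 26% drops quoted above; they are not the real yearly counts from the dataset, and `yearly_change` is a hypothetical helper name.

```python
def yearly_change(counts):
    """Return {year: percent change versus the previous year} for a {year: count} dict."""
    years = sorted(counts)
    return {
        year: round((counts[year] - counts[prev]) / counts[prev] * 100, 1)
        for prev, year in zip(years, years[1:])
    }


# Placeholder counts (NOT the real data), chosen only to illustrate the calculation.
placeholder_counts = {2014: 1000, 2015: 850, 2016: 629}

changes = yearly_change(placeholder_counts)
print(changes)
```

A negative value means fewer device types were released than in the year before.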

To end the article, here is a graph depicting the share of devices in each resolution category over the years. As expected, HD and FHD devices have been gaining momentum in recent years. But the share of Quad HD devices has stayed about the same (5.7-5.9%), probably due to battery constraints and the premium price tags that go with them.


As expected, HD and FHD devices have been gaining momentum in recent years, but the share of Quad HD devices has stayed about the same (5.7-5.9%)
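The share shown in the graph is each category's device count divided by the total number of devices released that year. A rough Python sketch of that calculation follows; the input shape (a list of year and category pairs), the toy records, and the name `category_share` are all assumptions for illustration.

```python
from collections import Counter, defaultdict


def category_share(devices):
    """devices: iterable of (year, category) pairs. Returns {year: {category: percent}}."""
    per_year = defaultdict(Counter)
    for year, category in devices:
        per_year[year][category] += 1

    shares = {}
    for year, counter in per_year.items():
        total = sum(counter.values())
        shares[year] = {cat: round(n / total * 100, 1) for cat, n in counter.items()}
    return shares


# Toy records (NOT the real data), only to show the shape of the result.
toy_devices = [(2016, "HD"), (2016, "HD"), (2016, "FHD"), (2016, "Quad HD")]

shares = category_share(toy_devices)
print(shares)
```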

End Notes

Yes, we are getting resolution hungry, with HD and FHD devices becoming more dominant. But the move to an even higher-resolution era (mass-market-wise) may not be as fast as it was for FHD.

That’s it for today. Stay tuned, and keep commenting and sharing your views.

You can catch up with my analysis series on other topics in the following links:

If you have a data-related challenge worth solving, you can reach me through the links in the profile section, and maybe we can help each other out.