# How Data Science can help you choose the right romper suit

Digitalization has taken a stronger hold in my family than in many companies. We organize ourselves on Threema and synchronize our shopping lists with Wunderlist. The one thing we haven't used yet is the wonderful world of predictive analytics. My family recently grew in size by one very sweet and very small person. While on parental leave, I decided to start playing with a predictive model that promises to solve a common problem faced by new parents.

## Never buy the wrong clothing size for your little ones again

The little guy is growing so fast that you can never be sure what size of clothes will fit him in a few months' time. You might come across a great snow suit at a flea market, only to find that it takes until mid-summer for your child to grow into it. That's a shame, and quite unnecessary, because there is plenty of good, freely available data to help you determine fairly accurately what size of clothes your little one will need at any time in the future.

## Growth charts for boys and girls

Anyone with small children will be familiar with the children's growth charts that are widely used in pediatricians' offices. They plot the height and weight of children in 10-percent steps (percentiles). They allow you to draw conclusions such as: "40% of boys were smaller at birth than ours". (You can do the same for girls, of course, but girls and boys have separate charts because their growth curves are different.) If you haven't come across these charts before, here is an example.

As a data scientist, I consider them a reliable prediction model for body size. A child born with a body size in the 40th percentile, for example, will normally continue to have a larger body size than 40% of children of the same sex as she continues to grow.

## Predicting clothing size with data science - how to do it

The predictions are based on the growth percentile curves. The procedure is quite simple. You work out the percentile ranking for your child's body size at birth. The calculation isn't based directly on the percentile score, however, but on what is known as a "Z-score", which, although related to the percentile score, is better suited to performing calculations. To do the calculation, you need not only those nice charts but also their equivalent numerical values. Fortunately, the US Centers for Disease Control and Prevention (CDC) has made this information available on the internet.
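To make the relationship between percentile and Z-score concrete, here is a minimal Python sketch. It assumes the reference table provides the three LMS parameters (L, M and S) for each age, which is the layout used in the CDC data files; the function names are my own and purely illustrative.

```python
from math import log
from statistics import NormalDist

# Standard normal distribution: the bridge between percentiles and Z-scores.
_STD_NORMAL = NormalDist()


def percentile_to_z(percentile: float) -> float:
    """Convert a percentile (strictly between 0 and 100) into a Z-score."""
    return _STD_NORMAL.inv_cdf(percentile / 100.0)


def z_to_percentile(z: float) -> float:
    """Convert a Z-score back into a percentile (0 to 100)."""
    return 100.0 * _STD_NORMAL.cdf(z)


def lms_z_score(measurement: float, l: float, m: float, s: float) -> float:
    """Z-score of a measurement given the LMS parameters for that age.

    l, m and s must come from the reference table row for the child's
    age and sex; no values are hard-coded here.
    """
    if l == 0:
        return log(measurement / m) / s
    return ((measurement / m) ** l - 1.0) / (l * s)
```

A child on the 40th percentile has a Z-score of roughly -0.25, and a child whose height equals the median M for their age has a Z-score of exactly 0.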

The next step is to use that Z-score to calculate the expected body size for each month of the child's life. Then you carry out a few date calculations to work out the calendar dates on which each month of life begins and ends (to make things easier, we assume that every month has 30 days), and the forecast is done. Oh yes, you then have to convert the predicted body size into a clothing size. It is good to remember that children's sizes in Germany always refer to the largest size in the range of sizes the garment will fit. A 62-size romper therefore fits children between 56 cm (the next size down) and 62 cm tall.
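The remaining steps can be sketched in Python along the following lines. This is not the code behind the calculator, just an illustration under a few assumptions: the reference table is passed in as a hypothetical `lms_by_month` dictionary mapping the month of life to its (L, M, S) parameters, every month has 30 days as described above, and German sizes run in 6 cm steps starting at 50 (the text only confirms 56 and 62).

```python
from datetime import date, timedelta
from math import ceil, exp


def lms_measurement(z: float, l: float, m: float, s: float) -> float:
    """Body size for a given Z-score (the inverse of the LMS Z-score formula)."""
    if l == 0:
        return m * exp(s * z)
    return m * (1.0 + l * s * z) ** (1.0 / l)


def month_window(birth: date, month_index: int, days_per_month: int = 30):
    """First and last calendar day of a month of life (month 0 starts at birth)."""
    start = birth + timedelta(days=month_index * days_per_month)
    end = start + timedelta(days=days_per_month - 1)
    return start, end


def clothing_size(height_cm: float, smallest: int = 50, step: int = 6) -> int:
    """Smallest size label that is at least the child's height.

    Assumes the size grid 50, 56, 62, ... (6 cm steps); adjust if needed.
    """
    if height_cm <= smallest:
        return smallest
    return smallest + step * ceil((height_cm - smallest) / step)


def forecast(birth: date, z: float, lms_by_month: dict):
    """One row per month: (start date, end date, height in cm, clothing size).

    lms_by_month is a hypothetical mapping {month_index: (L, M, S)} that
    the caller fills from the reference data for the child's sex.
    """
    rows = []
    for month_index, (l, m, s) in sorted(lms_by_month.items()):
        start, end = month_window(birth, month_index)
        height = lms_measurement(z, l, m, s)
        rows.append((start, end, round(height, 1), clothing_size(height)))
    return rows
```

With this convention a child who is 60 cm tall gets size 62, which matches the rule that a 62-size romper fits children between 56 cm and 62 cm.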

If you are to get the full benefit of the prediction, however, and perhaps even make predictions for other people whose only familiarity with Python is as a constrictor snake, you do need to put in a little work. Under the circumstances, this seemed like a good opportunity for me to become more familiar with Vega. For those who don't yet know Vega, it is – in my humble opinion – the upcoming framework for interactive visualization in data science and further afield. Forget ggplot, matplotlib and Bokeh; Vega is going to be the next big thing. It allows you to easily configure JavaScript-based visualizations in JSON, and besides the actual visualization, the framework also provides user interaction and data preparation functions.

But for now, it's worth knowing that you can actually implement the entire prediction model in Vega.

### The clothing size calculator is currently only available in German.

*The clothing size calculator is more fun in any browser other than Internet Explorer ;)