Using machine learning to estimate the chance of moving
How many people will move house in the next two years? Policy makers consider this a key question in determining how many houses need to be built. CBS bases its estimates on the results of the WoON survey on housing patterns and trends, which asks respondents whether they are considering moving within the next two years. The survey uses a questionnaire to gather information on respondents’ current living situation, including satisfaction with the home and living environment, as well as on their desire to move and their housing needs. The WoON survey is conducted every three years.
The Ministry of the Interior and Kingdom Relations, the commissioning party, has asked CBS to investigate alternative ways of collecting these data. In response, CBS has investigated whether the chance of moving can be estimated from register information using machine-learning technology. Our experimental pilot project has shown that it is indeed possible to replace the estimated chance of moving from the WoON survey with the estimated chance of moving from the registers.
The probability that an individual will move house in the next two years has been estimated for everyone included in the Personal Records Database (BRP). The estimates are based on CBS’ register information covering the period 1995-2016, including personal characteristics (such as age, gender and marital status) and household characteristics (such as type of household and household income). Information on past moves and regional characteristics has also been included in the models, as well as information on whether the home is rented or owner-occupied and on any changes in income. Major life events from the period 1995-2016, such as having children, marriage, living together and divorce, were also included.
The models include not only the event itself but also how long ago it took place. Finally, we examined how many of these changes occurred; someone can, for example, get divorced and then remarry. Additional features that can influence the moving motives of people in employment, such as travel times, commuting distance and the type of employment contract, have not been included. The same applies to housing characteristics.
The models have been optimised and trained to map the relationship between all of these characteristics and actual known moving behaviour in 2013 and 2014 as accurately as possible. The models were then applied to the Dutch population as registered on 1 January 2015. Based on a person's register information, the models estimate how likely it is that this individual will move house within two years. Because we know which individuals actually moved in 2015 and 2016, we can measure how well the model estimates approximate reality.
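The train-then-validate design described above can be sketched in a few lines of Python. This is a minimal illustration, not the CBS implementation: the function name, the assumed reference date of 1 January 2013 for the training window, and the example move dates are all hypothetical.

```python
from datetime import date

def moved_within_window(move_dates, reference, years=2):
    """Return True if any move falls in [reference, reference + years).

    Assumes `reference` is 1 January, so adding whole years never
    produces an invalid calendar date.
    """
    window_end = date(reference.year + years, reference.month, reference.day)
    return any(reference <= d < window_end for d in move_dates)

# Hypothetical register extract: person id -> dates of registered moves.
moves = {
    "A": [date(2013, 6, 1)],
    "B": [date(2016, 11, 15)],
    "C": [],
}

# Assumed start of the 2013-2014 window used for training.
train_reference = date(2013, 1, 1)
# Population as registered on 1 January 2015; outcomes observed in 2015-2016.
apply_reference = date(2015, 1, 1)

train_labels = {
    person: moved_within_window(dates, train_reference)
    for person, dates in moves.items()
}
validation_labels = {
    person: moved_within_window(dates, apply_reference)
    for person, dates in moves.items()
}
```

Keeping the training window and the validation window strictly separate in time is what allows the 2015-2016 outcomes to serve as an honest check on the estimates.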
A number of estimation methods have been tested to determine the best method for estimating the chance of moving. The tests covered logistic regression, lasso regression, ridge regression, random forests and survival models. These methods have the advantage that they can take many features into account at the same time, so the number of features does not have to be limited in advance. Ridge regression with interaction effects turned out to be the most effective model. This model estimates the chance of moving just as well as the estimate based on the desire to move reported in the WoON survey. Of all the people who were expected to move according to the chosen model, 39 percent actually moved house (the model's precision). Of the people who actually moved house, the model correctly classified 60 percent as movers (its recall). Of the people who did not move, the model correctly classified 81 percent as stayers (its specificity).
A total of 32 characteristics, and interactions between these characteristics, have been included in the model. The most important characteristics for estimating the chance of moving are: 1) someone’s age, 2) whether someone is a home owner or a tenant, also in relation to that person’s position in the household (for example single, part of a couple, or a child living at home), 3) the time that has passed since the last change in the household (for example having children, living together or getting divorced) and 4) the number of times a person has moved house in the past.
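The three percentages above correspond to standard confusion-matrix measures. The sketch below shows how they are computed; the counts are hypothetical figures per 10,000 persons, chosen only so that the rates come out close to the reported ones, and are not CBS data.

```python
# Hypothetical confusion-matrix counts per 10,000 persons (not CBS data).
true_movers_predicted_move = 1010    # true positives
true_movers_predicted_stay = 674     # false negatives
true_stayers_predicted_move = 1580   # false positives
true_stayers_predicted_stay = 6736   # true negatives

# Share of predicted movers who actually moved.
precision = true_movers_predicted_move / (
    true_movers_predicted_move + true_stayers_predicted_move
)
# Share of actual movers the model classified as movers.
recall = true_movers_predicted_move / (
    true_movers_predicted_move + true_movers_predicted_stay
)
# Share of actual stayers the model classified as stayers.
specificity = true_stayers_predicted_stay / (
    true_stayers_predicted_stay + true_stayers_predicted_move
)

print(f"precision={precision:.2f} recall={recall:.2f} specificity={specificity:.2f}")
```

Precision is much lower than specificity here because stayers far outnumber movers in these illustrative counts, so even a modest false-positive rate among stayers produces many wrongly predicted movers.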
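To make the idea of a ridge model with interaction effects concrete, here is a minimal pure-Python sketch. It rests on several assumptions: the article does not specify the exact formulation, so ridge regression for a mover/stayer outcome is taken here to mean logistic regression with an L2 penalty; the two features (scaled age and a tenant indicator), the toy data, and all function names are hypothetical.

```python
import math

def with_interaction(age_scaled, is_tenant):
    """Intercept, two main effects, and their product (the interaction)."""
    return [1.0, age_scaled, float(is_tenant), age_scaled * is_tenant]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    """Estimated probability of moving for one feature vector."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def fit_ridge_logistic(X, y, lam=0.1, lr=0.5, steps=2000):
    """Gradient descent on mean log-loss plus (lam / 2) * sum of squared weights.

    The intercept (index 0) is left unpenalised, a common convention.
    """
    n = len(X)
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / n
        for j in range(len(w)):
            penalty = lam * w[j] if j > 0 else 0.0
            w[j] -= lr * (grad[j] + penalty)
    return w

# Toy data (hypothetical): (age scaled to 0-1, tenant indicator) -> moved?
rows = [(0.2, 1), (0.3, 1), (0.25, 0), (0.5, 1),
        (0.6, 0), (0.7, 0), (0.8, 1), (0.9, 0)]
y = [1, 1, 1, 0, 0, 0, 1, 0]
X = [with_interaction(age, tenant) for age, tenant in rows]

weights = fit_ridge_logistic(X, y, lam=0.1)
probabilities = [predict(weights, xi) for xi in X]
```

The penalty strength `lam` shrinks the coefficients towards zero, which is what lets such a model carry many characteristics and interactions at once without overfitting; in practice it would be tuned on held-out data.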
The average moving probability per person for 2015 and 2016, as calculated by the model, is 0.31 with a standard deviation of 0.17. A probability lies between 0 and 1 and can be expressed as a percentage between 0 and 100 percent, so the average can be interpreted as a 31 percent chance per person of moving within two years. According to the model, most people have a 20 to 50 percent chance of moving, and some of them will probably move house. The study identified more than 373 thousand individuals with a probability of more than 90 percent, of whom slightly more than 30 thousand have a 100 percent chance of moving. According to the study, it is very likely that this group of people will move in the foreseeable future. At the other end, nearly 2.4 million people have a probability of 0 according to the model, meaning that their chance of moving is very small. In reality, almost 3.5 million people moved in this period.
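The summary figures in this paragraph are simple aggregates over the individual probabilities. The following sketch shows the arithmetic on a short hypothetical list of probabilities; the values are invented for illustration and do not reproduce the CBS results.

```python
from statistics import mean, pstdev

# Hypothetical model output: one estimated probability per person.
probabilities = [0.05, 0.20, 0.25, 0.30, 0.35, 0.45, 0.95]

average = mean(probabilities)          # average chance of moving
spread = pstdev(probabilities)         # population standard deviation
as_percent = [round(100 * p) for p in probabilities]

# Number of people with a chance of moving above 90 percent.
very_likely_movers = sum(1 for p in probabilities if p > 0.90)

# By linearity of expectation, the sum of the individual probabilities
# is the number of movers the model expects in this group.
expected_movers = sum(probabilities)
```

Comparing `expected_movers` with the number of people who actually moved is one quick way to check whether the probabilities are, on average, too high or too low.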
A breakdown of the moving chances by population group yields interesting insights. By type of household, a one-person household has an average chance of 26 percent (the lowest of all household types) and an unmarried couple with children a chance of 35 percent (the highest). By position in the household, children living at home have an average chance of 39 percent, which is relatively high compared to other persons in the household. This group includes children up to the age of four, school-age children, students in higher education and recent graduates in first-time jobs. At 25 percent, a parent in a single-parent household has the lowest chance of moving of all household positions.
First-time buyers in the housing market have an average chance of 35 percent of moving within two years, while those moving up the housing ladder to a new rental or owner-occupied home have a moving chance of 28 percent. In the interactive dashboard you can view the distribution of moving chances for various population groups.
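A breakdown like this amounts to averaging the individual probabilities within each group. The sketch below illustrates that step; the records, group labels and function name are hypothetical and the values are not CBS results.

```python
from collections import defaultdict

def average_by_group(records):
    """Average the estimated probability within each population group.

    `records` is an iterable of (group_label, probability) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, probability in records:
        totals[group] += probability
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}

# Hypothetical individual-level estimates, labelled by household type.
records = [
    ("one-person household", 0.20),
    ("one-person household", 0.30),
    ("unmarried couple with children", 0.40),
    ("unmarried couple with children", 0.30),
]

group_averages = average_by_group(records)
```

Only the group-level averages in `group_averages` would be reported; the individual rows stay internal to the analysis.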
Only anonymised data has been used for these analyses. It is not possible to trace back the information to specific individuals. CBS does not publish moving chances for individuals, only averages per population group.
The purpose of this innovative product is to reveal historical patterns. Policy makers, for example, can use these patterns in their planning and in shaping policy intentions. This pattern recognition makes it possible to see which population groups have a relatively high chance of moving and whether this chance changes over the years. Thanks to these insights, new housing developments can be better adapted to the needs of these groups. An important question for possible follow-up research is: “Where do people want to move to?”
CBS is interested in your opinion on this study. Are these estimates useful for policy makers? And which other applications are conceivable based on this approach?