How predictive analytics helps improve human capital management

Posted on Mar 20, 2017

SAP SuccessFactors Workforce Analytics & Planning (WFA&P) provides a set of analytical and planning capabilities that simplify access to data from multiple, disparate sources for Human Capital Management (HCM). WFA&P consists of Embedded Intelligence (Report Designer, Ad-Hoc Reporting, etc.), Workforce Analytics (Headlines, Benchmarks, etc.), and Workforce Planning (Forecasting, Risk Analysis, etc.). It can answer questions such as “Is turnover an issue?”, “What types of employees tend to leave voluntarily?”, and “Who are the top 10 highest impact-of-loss employees who would leave?”

WFA&P is currently extending its analytical and planning capabilities with predictive analytics to provide additional insights into HCM topics such as Absenteeism, Career Development, Performance Management, and Terminations. Predictive analytics adds the “Why” and “How” facets to the existing analytics capabilities by answering questions such as “What are the key drivers for turnover?” and “What will make those employees stay?”

Employee Flight Risk

One of the hot topics is voluntary terminations (or employee flight risk), where employees decide to leave the company on their own. Like customer churn prediction, which predicts the likelihood that a customer will cancel a service offered by a business, employee flight risk prediction predicts the likelihood that an employee will leave a company voluntarily. It does so by comparing and contrasting the details of employees who left on their own with the details of those who stayed.

In WFA&P, flight risk prediction relies on two sources of information: the End of Period (EOP) Headcount measure and the Voluntary Terminations measure. EOP Headcount is the actual number of people employed at the end of a reporting period, and Voluntary Terminations is the number of employees who voluntarily terminated their employment with the organization.
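To make the two measures concrete, here is a minimal sketch of how they could be turned into labeled training records: employees counted in EOP Headcount are labeled as having stayed, and employees counted in Voluntary Terminations are labeled as having left. The function name, the employee IDs, and the 0/1 encoding are assumptions made for illustration; the article does not describe how WFA&P stores or labels this data.

```python
def label_employees(eop_headcount_ids, voluntary_termination_ids):
    """Return {employee_id: label}, where 1 = left voluntarily and 0 = stayed.

    Hypothetical helper: the inputs are plain lists of employee IDs, which is
    an assumption for this sketch, not the WFA&P data model.
    """
    labels = {emp_id: 0 for emp_id in eop_headcount_ids}
    for emp_id in voluntary_termination_ids:
        labels[emp_id] = 1
    return labels


# Illustrative IDs only.
stayed = ["E001", "E002", "E003"]  # counted in EOP Headcount
left = ["E004", "E005"]            # counted in Voluntary Terminations

labels = label_employees(stayed, left)
print(labels)
```

A model can then be trained to separate the two groups using the employee details described next.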

Employee details are represented by a set of workforce dimensions, which provide a 360-degree view of an employee. The dimensions are organized in categories, such as Biographical / Diversity (Age, Gender, Disability, Ethnicity, Generation, etc.), Compensation (Salary Range, Stock Options, etc.), Development (Key Position, Performance Rating, Potential Rating, etc.), Employment (Job Category, Employee Class, Employment Level, Grade, etc.), Succession (Critical Job Role, Succession Rating, Successor Readiness, etc.), and Tenure (Grade Tenure, Organization Tenure, Position Tenure, Time in Grade, etc.).

Automated Predictive Modeling

The flight risk prediction application uses SAP’s Automated Predictive Library (APL, also known as KXEN) for its predictive algorithms and data mining environment. APL is based on the concepts of Vapnik-Chervonenkis (VC) dimension and Structural Risk Minimization, which keep the model simple and robust. It employs ridge regression, a non-parametric algorithm, to minimize the need for assumptions about data distributions. Its data mining environment automates parameter tuning and model selection, among other things, freeing users from performing these tasks manually.
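For readers unfamiliar with ridge regression, the sketch below shows the textbook closed form, which solves (XᵀX + λI)w = Xᵀy. It is not APL's implementation, whose internals the article does not describe. The function name `ridge_fit`, the penalty values, and the toy data are assumptions chosen for illustration.

```python
def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y by Gaussian elimination.

    Textbook ridge regression without an intercept; a sketch, not APL code.
    """
    n = len(X[0])
    # Build the regularized normal equations A w = b.
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(row[i] * target for row, target in zip(X, y)) for i in range(n)]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]

    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - s) / A[i][i]
    return w


# Two perfectly correlated attributes: ordinary least squares has no unique
# solution here, but the ridge penalty makes the system solvable.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
y = [2.0, 4.0, 6.0]

weights_small_penalty = ridge_fit(X, y, lam=1.0)
weights_large_penalty = ridge_fit(X, y, lam=10.0)
print(weights_small_penalty, weights_large_penalty)
```

The penalty splits the weight evenly between the two duplicated attributes, and a larger penalty shrinks the weights further. This is the kind of behavior that helps a model stay stable when workforce attributes overlap.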

The process of extracting historical headcount data, creating a model, and applying that model to the current employee headcount data has been automated and is shown in the figure below.

Prediction Performance

Many factors contribute to the quality of a predictive model. Two of the most basic are the number of attributes and the number of records. If the number of attributes is small, the model is likely to be biased. If the number of records is small, the variance may be too high to separate one outcome from another. An adequate number of both attributes and records is therefore essential to creating a high-quality predictive model.

During development, we examined the relationship between the number of attributes used and prediction performance, measured as predictive power, which translates to prediction accuracy: the ratio of correct predictions to the total number of records. We found that predictive power tends to increase with the number of attributes used to create the model.
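The accuracy ratio described above can be computed directly. The following is a minimal sketch; the function name and the sample predictions are made up for illustration and are not customer data.

```python
def prediction_accuracy(predicted, actual):
    """Return the share of records whose predicted label matches the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one record is required")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Illustrative labels: 1 = left voluntarily, 0 = stayed.
predicted = [1, 0, 0, 1, 0]
actual = [1, 0, 1, 1, 0]

accuracy = prediction_accuracy(predicted, actual)
print(accuracy)  # 4 of the 5 predictions match
```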

The data used for this analysis come from six WFA&P customers in different businesses and industries, ranging from Energy to Retail, and were drawn from the workforce categories described earlier. As shown in the heat map below, the available attributes and records differ widely. The ratio in each cell, which ranges from 0 to 1, is the fraction of attributes available to a given customer in a given attribute category. A value of 1.00 means the customer has all the attributes in that category, and 0.00 means the customer has none. For example, the Energy customer has all of the Biographical / Diversity and Compensation attributes but no Succession attributes.
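The per-cell ratio can be sketched as a simple coverage calculation. The category catalog below contains only the example attributes named earlier in this article, since the full attribute lists are not given, and the customer's attribute set is invented to mirror the Energy example. The function name is also an assumption.

```python
# Example attributes per category, copied from the lists earlier in the
# article. The real categories contain more attributes than shown here.
CATEGORY_ATTRIBUTES = {
    "Compensation": ["Salary Range", "Stock Options"],
    "Succession": ["Critical Job Role", "Succession Rating", "Successor Readiness"],
}


def coverage_ratios(available, catalog):
    """Return {category: share of that category's attributes the customer has}."""
    ratios = {}
    for category, attributes in catalog.items():
        present = sum(1 for attribute in attributes if attribute in available)
        ratios[category] = present / len(attributes)
    return ratios


# Hypothetical customer with full Compensation data and no Succession data.
customer_attributes = {"Salary Range", "Stock Options"}

ratios = coverage_ratios(customer_attributes, CATEGORY_ATTRIBUTES)
print(ratios)
```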

As expected, the quality of the model created for each customer is affected by the number of attributes used to create it. As shown at the bottom of the heat map, the predictive power, represented by color and shade, is largely consistent with the number of attributes used to create the model.

Prediction Insights

Besides model quality, another important quality measure of a predictive analytics application is its ability to provide insights that are easy to understand. In APL, the insights are represented as Influencers. As shown in the chart below, the influencers of a predictive model for a given customer include Grade Band, Organization Tenure, Performance Rating, and Job Function (Function View L3). Among them, Grade Band has the greatest effect on employees’ decision to leave, followed by Organization Tenure, and so on.
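One simple way to think about such a ranking is to normalize each influencer's raw contribution so that the shares sum to 1 and then sort them. The sketch below does only that. The raw scores are invented to reproduce the order described above, and the article does not say how APL actually derives its contributions.

```python
def rank_influencers(raw_scores):
    """Return (name, share) pairs sorted from most to least influential."""
    total = sum(abs(score) for score in raw_scores.values())
    if total == 0:
        raise ValueError("at least one influencer must have a non-zero score")
    shares = {name: abs(score) / total for name, score in raw_scores.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores, chosen only to match the order in the article.
raw_scores = {
    "Performance Rating": 2.0,
    "Grade Band": 4.0,
    "Job Function": 1.0,
    "Organization Tenure": 3.0,
}

ranking = rank_influencers(raw_scores)
for name, share in ranking:
    print(f"{name}: {share:.2f}")
```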


These insights can be further refined by examining the categories of each influencer, where categories are the values a given influencer can have. For example, the categories for Performance Rating include “High Performer”, “Mid Performer”, “Low Performer”, and “Not Rated”. As shown in the following chart, the categories “Low Performer” and “Not Rated” have a positive effect on flight risk, whereas the categories “Mid Performer” and “High Performer” have a negative effect. In other words, employees who are low performers or not rated are more likely to leave, whereas employees who are mid performers or high performers are less likely to leave.
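The sign of a category's effect can be illustrated by comparing the voluntary leave rate within each category to the overall leave rate: a positive difference means higher-than-average flight risk. This is a simplified stand-in for whatever measure APL uses, which the article does not specify, and the counts below are invented.

```python
def category_effects(counts):
    """Return {category: category leave rate minus overall leave rate}.

    counts maps each category to (leavers, employees_in_category).
    """
    total_leavers = sum(leavers for leavers, _ in counts.values())
    total_employees = sum(size for _, size in counts.values())
    overall_rate = total_leavers / total_employees
    return {
        category: leavers / size - overall_rate
        for category, (leavers, size) in counts.items()
    }


# Hypothetical counts for the Performance Rating influencer.
counts = {
    "High Performer": (2, 40),
    "Mid Performer": (6, 100),
    "Low Performer": (9, 30),
    "Not Rated": (8, 30),
}

effects = category_effects(counts)
for category, effect in effects.items():
    direction = "raises" if effect > 0 else "lowers"
    print(f"{category}: {effect:+.3f} ({direction} flight risk)")
```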

