Injury Severity among Young Drivers: A combination of Parametric and Non-parametric Analysis

Young drivers are among the groups recording the highest fatality index in road traffic crashes. The objective of this study was to identify factors affecting the injury severity of young drivers using five years of crash data (2008–2012) from Sabah, Malaysia. The study used a combination of parametric and non-parametric analysis, which allows the specification of nonlinearities and interactions in addition to main effects. The results indicate that crashes at night, on federal roads and involving a single vehicle are positively associated with injury among young drivers. Interestingly, crashes on municipal roads, crashes involving female drivers, and crashes at roundabouts and T/Y junctions are less likely to involve injury. A higher-order interaction suggests that crashes in which not-at-fault young drivers are involved in out-of-control or hit-object collisions are more likely to be severe. On the other hand, overturn and sideswipe collisions involving young passenger car drivers are negatively associated with injury. It was also found that young drivers of passenger cars and four-wheel drives who drive too close are less likely to be injured when involved in rear-end collisions. The findings of this study will help the related authorities to design well-targeted restrictive measures to reduce the injury severity of young drivers in traffic crashes.


INTRODUCTION
Road traffic crashes are a leading cause of death, especially among people aged 5-29 years. In 2018, the World Health Organization (WHO) declared road traffic crashes the number one killer of this age group (WHO, 2018). In Malaysia, this group represented 30% of total road traffic fatalities in 2013 (JKJR, 2014). In Sabah specifically, drivers aged under 25 recorded the highest proportion of involvement in road traffic crashes and contributed more than half of fatalities (see Table 1). This issue needs to be explored further to avoid the loss of valuable human resources for future development.
Research on young drivers is not new in road safety (e.g. Dissanayake and Lu, 2002; Huang and Winston, 2011; Ismail et al., 2016; Dissanayake, 2004). Young drivers have been identified as having poor safety performance. In addition, compared to adult drivers, young drivers have less technical ability to control a vehicle safely. Dissanayake and Lu (2002) reveal that young drivers are inexperienced, immature, prone to risk-taking behaviour and at greater risk of exposure. Ismail et al. (2016) surveyed 127 students with a mean age of 22 to identify the relationship between personality traits and aggressiveness among young Malaysian drivers. They found that young drivers tend to underestimate their driving ability. They suggest that the implementation of a personality test to reduce road rage and aggressive driving should be considered in the future. Huang and Winston (2011) recommended developing driving skills, expertise and competencies, including psychomotor, cognitive and perceptual proficiencies, to bridge the gap between the safety limit and the freedom to explore in driving among young drivers.
ISSN : 2615-2312(ONLINE) ISSN : 2615-1596
Previous research has identified a few factors that influence the injury severity of young drivers. Dissanayake (2004) developed models to predict the crash severity of young and older drivers in single-vehicle crashes. The study identified that frontal impact increases crash severity for older drivers rather than for young drivers. In addition, night-time crashes, speeding behaviour and the existence of a grade or curve also influenced crash severity. Clarke et al. (2010) analysed 1184 fatally injured vehicle occupants in the UK. They found that the majority of fatal crashes along bend segments involved young drivers. Although there is some research on young drivers, much remains unknown about how crashes among young drivers differ from those among middle-aged and older groups, especially in middle-income countries like Malaysia.
The objective of this study was to identify the crash, road, driver and vehicle characteristics associated with injury-producing crashes among young drivers. In Malaysia, all crash victims are required to make a police report using the POL27 form for claim purposes. However, not all variables are recorded in the system, especially for property damage only crashes. The variables used in this study are time of day, day of the week, season of the year, school season, collision type, crash type, road type, road geometry, driver gender, driver errors and vehicle type.
A total of 48,080 crashes involving young drivers occurred within the study period. In Malaysia, driver severity is categorized into four levels: fatal, serious injury, slight injury and property damage only (PDO). Due to the large differences between PDO and the other injury categories, two levels of injury were created in this study: severe (fatal, serious and slight injury) and non-severe (PDO). Hereafter, crash severity refers to the injury severity of young drivers.
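The two-level recoding described above can be sketched as follows. This is only an illustration under stated assumptions: the label strings and the function name `to_binary_severity` are hypothetical, since the paper does not give the coding scheme used in the POL27 records.

```python
# Hypothetical label strings; the actual POL27 coding is not given in the paper.
SEVERE_LEVELS = {"fatal", "serious injury", "slight injury"}
NON_SEVERE_LEVELS = {"pdo", "property damage only"}


def to_binary_severity(level: str) -> int:
    """Return 1 for severe (any injury level) and 0 for non-severe (PDO)."""
    normalized = level.strip().lower()
    if normalized in SEVERE_LEVELS:
        return 1
    if normalized in NON_SEVERE_LEVELS:
        return 0
    raise ValueError(f"unknown severity level: {level!r}")


# Illustrative records, not data from the study.
records = ["Fatal", "PDO", "slight injury", "property damage only"]
labels = [to_binary_severity(r) for r in records]
```

Raising an error on unknown labels, rather than silently defaulting to PDO, is a design choice that keeps miscoded records from inflating the non-severe class.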

b. Methodology
A two-step modelling technique combining a decision tree and logistic regression was used. This method has been applied in road safety research (e.g. Washington, 2000; Haque et al., 2016; Rusli et al., 2018). A decision tree is a non-parametric method that identifies possible relationships among explanatory factors and driver injury severity. The Chi-Squared Automatic Interaction Detection (CHAID) data mining algorithm was used in this study. However, the non-parametric method is prone to Type I errors. The groups identified by the decision tree were therefore converted to indicator variables to support statistical inference.
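As a rough sketch of the second step, the groups (leaves) found by a tree can be written as rules and converted to 0/1 indicator variables that enter the regression alongside the main effects. The field names, rule names and rule conditions below are hypothetical and chosen only for illustration; they are not the fitted CHAID tree from this study.

```python
from typing import Callable, Dict, List

Crash = Dict[str, str]

# Hypothetical leaf rules: each maps a crash record to True/False.
# Names and conditions are illustrative, not the tree estimated in the paper.
LEAF_RULES: Dict[str, Callable[[Crash], bool]] = {
    "leaf_overturn_sideswipe_heavy_day": lambda c: (
        c["collision"] in {"overturn", "sideswipe"}
        and c["vehicle"] in {"lorry", "bus"}
        and c["time"] == "day"
    ),
    "leaf_rear_end_van_night": lambda c: (
        c["collision"] == "rear-end"
        and c["vehicle"] == "van"
        and c["time"] == "night"
    ),
}


def leaf_indicators(crash: Crash) -> Dict[str, int]:
    """Return one 0/1 indicator per leaf rule for a single crash record."""
    return {name: int(rule(crash)) for name, rule in LEAF_RULES.items()}


# Illustrative records, not data from the study.
crashes: List[Crash] = [
    {"collision": "sideswipe", "vehicle": "bus", "time": "day"},
    {"collision": "rear-end", "vehicle": "van", "time": "night"},
    {"collision": "head-on", "vehicle": "passenger car", "time": "day"},
]
design_rows = [leaf_indicators(c) for c in crashes]
```

Each indicator then behaves like any other binary explanatory variable, so its coefficient can be tested for significance in the parametric model.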
Binary logistic regression was used to examine the relationship between the explanatory variables and the response variable. The logit is the natural logarithm of the odds that the response variable Y is severe (Y=1) versus non-severe (Y=0), as shown in Eq. 1:

$$\operatorname{logit}(P) = \ln\left(\frac{P}{1-P}\right) = \beta_0 + \sum_{i} \beta_i X_i \qquad (1)$$

where $P$ is the probability that a driver involved in a crash is severely injured, $X_i$ is the independent variable, $\beta_0$ is the intercept and $\beta_i$ is the model coefficient directly determining the odds ratio.

Table 2 presents the summary statistics of the variables included in this study. The proportion of severe crashes involving young drivers was higher for crashes occurring during daytime (61.8%) than during night time (38.4%). Within the week, weekdays account for a higher proportion of severe crashes involving young drivers. The highest proportions of severe crashes were also recorded during dry seasons and on school days.
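The relationship in Eq. 1 between the linear predictor, the probability of a severe outcome and the odds ratio can be illustrated with a short sketch. The coefficient values and the names `beta0`, `betas` and `predict_severe` are assumptions made for this example; they are not the estimates reported in Table 3.

```python
import math


def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z: float) -> float:
    """Map a linear predictor z back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_severe(beta0: float, betas: list, xs: list) -> float:
    """P(Y=1) for one crash, given an intercept, coefficients and indicators."""
    z = beta0 + sum(b * x for b, x in zip(betas, xs))
    return inv_logit(z)


# Hypothetical coefficients (NOT the values estimated in the paper).
beta0, betas = -3.0, [0.7, -0.4]
p_base = predict_severe(beta0, betas, [0, 0])
p_x1 = predict_severe(beta0, betas, [1, 0])

# Switching the first indicator from 0 to 1 multiplies the odds by exp(beta_1).
odds_ratio = (p_x1 / (1 - p_x1)) / (p_base / (1 - p_base))
```

In this sketch a positive coefficient gives an odds ratio above 1 (higher odds of a severe outcome) and a negative coefficient gives an odds ratio below 1.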

RESULTS AND DISCUSSION
Out of 8 collision types, head-on collisions recorded the highest proportion of severe crashes (39.3%), followed by angle and right-angle side collisions (28.8%) and rear-end collisions (13.5%). Out-of-control, sideswipe, overturn, vehicle-animal/pedestrian and vehicle-object collisions recorded 11.7%, 3.1%, 1.6%, 1.1% and 1.0% of severe crashes among young drivers, respectively. More than one vehicle was involved in 62.1% of severe crashes.
In this crash data set, three road types were included in the analysis: state road, federal road and municipal road. The highest proportion of severe crashes occurred on federal roads (55.9%). State roads recorded 997 severe crashes, representing 40.0% of total severe crashes during the study period. In terms of road geometry, straight road sections recorded the highest proportion of severe crashes (63.8%), followed by bend sections (21.7%) and T/Y junctions (11.1%).
Young male drivers represent the largest group involved in both severe (85.3%) and non-severe (76.1%) crashes. Six driver behaviours were included in this analysis, along with one variable representing not-at-fault drivers. Of these, not-at-fault drivers represent the largest group involved in severe crashes, with a proportion of 71.1%. Speeding recorded the next highest proportion of severe crashes (12.6%), followed by dangerous turning (6.4%). Passenger cars made up most crash-involved vehicle types (62.3% of non-severe crashes and 48.6% of severe crashes). Four-wheel drives recorded the second highest proportions of severe and non-severe crashes, at 20.6% and 21.2% respectively.
Figure 1 shows the decision tree for crash severity among young drivers in Sabah, Malaysia. The decision tree was validated using 10-fold stratified cross-validation, in which on each cycle nine-tenths of the data were used to train the decision tree and the remaining one-tenth was used to measure the fit. The estimated decision tree correctly classified 94% of instances, using 33 leaves for a total tree size of 53 nodes. At the top of the tree, collision type represents the highest information gain. The estimated decision tree classified driver severity by segmenting the dataset into 33 smaller and more homogeneous groups. The group statistics that indicate the classification rules for crash severity are reported in Figure 1. For example, the statistics of branch 1 suggest that 14.6% of overturn or sideswipe collisions involving a lorry or bus in the daytime resulted in injury and 85.4% did not result in reported injury. Likewise, branch 33 shows that, out of the total 48,080 crashes involving young drivers, 0.6% were daytime collisions between a passenger car and an animal or pedestrian.
Out of these, 54.2% were crashes that resulted in injury and 45.8% were crashes that did not result in reported injury. Following the branches of the estimated decision tree, 33 higher-order interaction terms were created and numbered as shown in the brackets at the bottom of each tree branch. These 33 higher-order interaction terms, coupled with the main effect variables, were tested in the binary logistic model. Table 3 shows the logistic regression estimates for young drivers' injury severity. Out of eleven selected factors, eight were found to be statistically significant in influencing the injury severity of young drivers: time of the day, crash type, collision type, road type, road geometry, gender, driver errors and vehicle type. In addition to these main effect variables, the logistic model also identified 16 higher-order interaction variables affecting the severity of young drivers, including interaction variables 4, 6, 7, 12, 15, 16, 17, 19, 20, 22, 26, 28, 29, 30
Among the 33 higher-order interaction variables found in the decision tree analysis, eight are positively and another eight are negatively associated with injury-producing crashes among young drivers. Interaction variable 6 is positively associated with injury-producing crashes, with the corresponding odds about 8.9 times higher (OR 8.925, 95% CI: 1.936-41.154). The same was observed for interaction variables 7, 12 and 15, with corresponding odds respectively 5.0 times (
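As a worked check on how an odds ratio and its interval relate to the underlying coefficient, the values reported above for interaction variable 6 can be converted back to the log-odds scale. This sketch assumes the interval is a Wald interval, symmetric on the log scale with a normal quantile of about 1.96; the paper does not state how the interval was computed, and the function name `wald_from_or` is hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% standard normal quantile


def wald_from_or(or_value: float, ci_low: float, ci_high: float):
    """Recover coefficient, standard error and z statistic from an OR and CI.

    Assumes a Wald interval: exp(beta -/+ Z_95 * se).
    """
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = beta / se
    return beta, se, z


# Values reported in the text for interaction variable 6.
beta, se, z = wald_from_or(8.925, 1.936, 41.154)

# Under the Wald assumption, the log-scale midpoint of the interval
# should sit close to the coefficient itself.
midpoint = (math.log(1.936) + math.log(41.154)) / 2
```

The wide interval corresponds to a large standard error on the log-odds scale, which suggests that the estimate for this interaction rests on relatively few crashes.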

a. Crash characteristics
The results show that crashes during night time are more severe. This finding is in line with previous studies (e.g. Dissanayake & Lu, 2002; Clarke et al., 2010). Single-vehicle crashes were found to be more likely to produce injury when young drivers are involved. Kockelman & Kweon (2002) suggest that reckless driving behaviour increases the probability of involvement in severe single-vehicle crashes among young drivers compared to middle-aged drivers. Dissanayake & Lu (2002) explain that single-vehicle crashes are more severe because, in most cases, the vehicle leaves the road and overturns or hits a roadside object such as a tree or a pole. Among the seven collision types included in this study, angle and right-angle side, head-on, sideswipe and overturn collisions were found to be statistically significant and positively associated with injury-producing crashes. Previous studies have also identified that head-on collisions increase the probability of severe crashes because they involve a vehicle from the opposite direction (Hosseinpour, Shukri Yahaya, Farhan Sadullah, Ismail, & Reza Ghadiri, 2016). The impact of this collision is higher because of the speed difference between the two opposing vehicles. The time available for reaction is also very limited in this type of collision. However, Dissanayake (2004) identified that frontal impact significantly influences severity only for older drivers and not for young drivers.

b. Road factors
Crashes on federal roads were found to have a higher probability of being severe than those on state roads. On the other hand, crashes on municipal roads were found to be less likely to result in reported injury. A possible explanation is the different standards applied to these three types of roads. In addition, different agencies are involved in the management and maintenance of these roads under different budget sources. Roundabouts and T/Y junctions were found to be significant and negatively associated with injury-producing crashes. These two types of intersection have less severe conflict points compared to other types of intersections.

c. Driver factors
The results show that the odds of female drivers being involved in injury-producing crashes were lower than those of male drivers. This finding is in line with previous research on young drivers in Finland (Laapotti & Keskinen, 1998), which suggested that males are more apt to engage in risky driving behaviour. On the other hand, female drivers are more cautious in driving and follow the traffic rules. 'Not-at-fault' drivers are the group with the highest proportion of involvement in injury-producing crashes. Traffic light violation is among the driver errors that increase the probability of involvement in severe crashes. This problem is very serious in Malaysia. The government has taken many actions to address this behaviour, including the installation of red-light cameras. The effectiveness of red-light cameras in Malaysia has been examined in previous research (e.g. Jamil, Shabadin, & Rahim, 2014; Kabit, Sabihin, & WH, 2016).

d. Vehicle factors
The involvement of a heavy vehicle in a crash involving young drivers is positively associated with crash severity. The main reason is that the size and weight of heavy vehicles increase the probability of severe crashes. In addition to heavy vehicles, crashes involving vans and four-wheel drives were also found to increase the probability of severe crashes among young drivers. The design of vans is one of the reasons: vans are designed with less protection from frontal impact. Four-wheel drives have a higher centre of gravity and are less stable than passenger cars (Keall, Newstead, & Watson, 2006).

e. Higher-order interaction
Interaction variable 4 captures the higher-order interaction between passenger car, overturn and sideswipe collisions, and road type. The results suggest that crash severity is lower if a passenger car is involved in an overturn or sideswipe collision on state and municipal roads. On the other hand, not-at-fault drivers involved in out-of-control or hit-object collisions were found to be positively associated with injury-producing crashes, as represented by interaction variables 6 and 7. Interaction variable 12 represents young drivers of a heavy vehicle who were speeding, overtaking dangerously, driving carelessly, violating traffic lights or not at fault and were involved in a rear-end collision, which resulted in more severe crashes. The same was observed when drivers of a passenger car or four-wheel drive exhibiting speeding, dangerous overtaking, careless driving or traffic light violation behaviours were involved in a rear-end collision (interaction variable 15). However, if a careless driver of a passenger car or four-wheel drive is involved in a rear-end collision, the crash does not result in reported injury. Van drivers are positively associated with injury-producing crashes only in rear-end collisions during night time (interaction variable 17). For angle and right-angle side collisions, night-time crashes involving a passenger car were found to be positively associated with injury-producing crashes (interaction variable 20). Interaction variable 22 captures the higher-order interaction between collision type, time of the day and road geometry. The results suggest that crash severity is higher if an angle or right-angle side collision happens along a bend section during the daytime.
Although head-on collisions were found to be positively associated with injury-producing crashes, head-on collisions on federal roads involving bad driver behaviours (e.g. speeding, driving too close, dangerous turning, dangerous overtaking, careless driving and traffic light violation) were found to be negatively associated with injury-producing crashes (interaction variable 26). Head-on collisions on state roads during night time were also found to be negatively associated with injury-producing crashes (interaction variable 28). Head-on collisions on municipal roads are associated with crashes that do not result in reported injury.
Vehicle-animal/pedestrian collisions at night time were found to be negatively associated with injury-producing crashes. In addition, daytime vehicle-animal collisions were also found to be negatively associated with injury-producing crashes for passenger cars.

CONCLUSION
Five years of crash data from Sabah, Malaysia were used to identify factors influencing injury severity in crashes involving young drivers. A combination of non-parametric analysis (decision tree) and parametric analysis (binary logistic regression) was applied in this study, offering more insight into crash severity through higher-order interaction variables. The decision tree identified 33 higher-order interactions between main effect variables. Binary logistic regression was then performed to examine these interaction variables together with the main effect variables in influencing the injury severity of young drivers.
Out of eleven main effect variables, eight were found to be statistically significant in explaining the injury severity of young drivers: time of the day, crash type, collision type, road type, road geometry, gender, driver errors and vehicle type. In addition to the main effect variables, 16 interaction variables were found to be statistically significant in influencing the injury severity of young drivers.
Crashes during night time, single-vehicle crashes, crashes on federal roads and crashes involving traffic light violation were found to be more severe when young drivers were involved. In terms of collision type, angle and right-angle side, head-on, sideswipe and overturn collisions were found to increase the probability of injury. Crashes involving heavy vehicles (lorry and bus) were more severe than those involving passenger cars. The same was observed for crashes involving four-wheel drives and vans.
On the other hand, a few variables were found to contribute to less severe crashes, including municipal roads, roundabouts and T/Y junctions, and female drivers. Out of the 6 driver errors contributing to crashes, three were found to be associated with less severe crashes: speeding, driving too close and dangerous overtaking.
In terms of higher-order interaction variables, eight were found to be significantly and positively associated with injury-producing crashes, while another eight were found to be negatively associated. The findings of this study will help the related authorities to design well-targeted restrictive measures to reduce the injury severity of young drivers in traffic crashes.