Objective: To develop a method enabling human-like, flexible supervisory control via delegation to automation. Background: Real-time supervisory relationships with automation are rarely as flexible as human task delegation to other humans. Flexibility in human-adaptable automation can provide important benefits, including improved situation awareness, more accurate automation usage, more balanced mental workload, increased user acceptance, and improved overall performance. Method: We review problems with static and adaptive (as opposed to "adaptable") automation; contrast these approaches with human-human task delegation, which can mitigate many of the problems; and revise the concept of a "level of automation" as a pattern of task-based roles and authorizations. We argue that delegation requires a shared hierarchical task model between supervisor and subordinates, used to delegate tasks at various levels, and offer instruction on performing them. A prototype implementation called Playbook® is described. Results: On the basis of these analyses, we propose methods for supporting human-machine delegation interactions that parallel human-human delegation in important respects. We develop an architecture for machine-based delegation systems based on the metaphor of a sports team's "playbook." Finally, we describe a prototype implementation of this architecture, with an accompanying user interface and usage scenario, for mission planning for uninhabited air vehicles. Conclusion: Delegation offers a viable method for flexible, multilevel human-automation interaction to enhance system performance while maintaining user workload at a manageable level. Application: Most applications of adaptive automation (aviation, air traffic control, robotics, process control, etc.) are potential avenues for the adaptable, delegation approach we advocate. We present an extended example for uninhabited air vehicle mission planning.
Objective: Effects of four types of automation support and two levels of automation reliability were examined. The objective was to examine the differential impact of information and decision automation and to investigate the costs of automation unreliability. Background: Research has shown that imperfect automation can lead to differential effects of stages and levels of automation on human performance. Method: Eighteen participants performed a "sensor to shooter" targeting simulation of command and control. Dependent variables included accuracy and response time of target engagement decisions, secondary task performance, and subjective ratings of mental workload, trust, and self-confidence. Results: Compared with manual performance, reliable automation significantly reduced decision times. Unreliable automation led to greater cost in decision-making accuracy under the higher automation reliability condition for three different forms of decision automation relative to information automation. At low automation reliability, however, there was a cost in performance for both information and decision automation. Conclusion: The results are consistent with a model of human-automation interaction that requires evaluation of the different stages of information processing to which automation support can be applied. Application: If fully reliable decision automation cannot be guaranteed, designers should provide users with information automation support or other tools that allow for inspection and analysis of raw data.
Objective: This study examined operators' capacity to successfully reallocate highly autonomous in-flight missiles to time-sensitive targets while performing secondary tasks of varying complexity. Background: Regardless of the level of autonomy for unmanned systems, humans will necessarily be involved in mission planning, higher level operation, and contingency interventions, otherwise known as human supervisory control. As a result, more research is needed on the impact of dynamic decision support systems that support rapid planning and replanning in time-pressured scenarios, particularly on operator workload. Method: A dual-screen simulation that allows a single operator to monitor and control 8, 12, or 16 missiles through high-level replanning was tested with 42 U.S. Navy personnel. Results: The most significant finding was that when attempting to control 16 missiles, participants' performance on three separate objective performance metrics and their situation awareness were significantly degraded. Conclusion: These results mirror studies of air traffic control that demonstrate a similar decline in performance for controllers managing 17 aircraft as compared with those managing only 10 to 11 aircraft. Moreover, the results suggest that a 70% utilization (percentage busy time) score is a valid threshold for predicting significant performance decay and could be a generalizable metric that can aid in manning predictions. Application: This research is relevant to human supervisory control of networked military and commercial unmanned vehicles in the air, on the ground, and on and under the water.
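The utilization metric mentioned in the Conclusion (percentage busy time, with 70% as the proposed threshold) can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the study's scoring procedure: the interval data, the function names, and the choice to merge overlapping busy intervals are all hypothetical.

```python
def busy_time(intervals):
    """Total time covered by (start, end) busy intervals, counting overlaps once."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close out the previous merged interval, if any, and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching interval: extend the current one.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def utilization(intervals, session_length):
    """Percentage of the session during which the operator was busy."""
    return 100.0 * busy_time(intervals) / session_length


# Hypothetical value taken from the 70% figure in the abstract.
THRESHOLD = 70.0

# Hypothetical busy intervals (seconds) within a 600-second session.
light_load = [(0, 120), (100, 200), (300, 500)]
heavy_load = [(0, 120), (150, 400), (450, 600)]

for label, intervals in (("light", light_load), ("heavy", heavy_load)):
    score = utilization(intervals, 600)
    print(f"{label}: {score:.1f}% busy, above threshold: {score > THRESHOLD}")
```

Merging overlaps keeps concurrent activities from being double counted, so the score cannot exceed 100%.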
Objective: This study assesses the influence of the auditory characteristics of alerts on perceived urgency and annoyance and whether these perceptions depend on the context in which the alert is received. Background: Alert parameters systematically affect perceived urgency, and mapping the urgency of a situation to the perceived urgency of an alert is a useful design consideration. Annoyance associated with environmental noise has been thoroughly studied, but little research has addressed whether alert parameters differentially affect annoyance and urgency. Method: Three 2³ × 3 mixed within/between factorial experiments, with a total of 72 participants, investigated nine alert parameters in three driving contexts. These parameters were formant (similar to harmonic series), pulse duration, interpulse interval, alert onset and offset, burst duty cycle, alert duty cycle, interburst period, and sound type. Imagined collision warning, navigation alert, and E-mail notification scenarios defined the driving context. Results: All parameters influenced both perceived urgency and annoyance (p < .05), with pulse duration, interpulse interval, alert duty cycle, and sound type influencing urgency substantially more than annoyance. There was a strong relationship between perceived urgency and rated appropriateness for high-urgency driving scenarios and a strong relationship between annoyance and rated appropriateness for low-urgency driving scenarios. Conclusion: Sound parameters differentially affect annoyance and urgency. Also, urgency and annoyance differentially affect perceived appropriateness of warnings. Application: Annoyance may merit as much attention as urgency in the design of auditory warnings, particularly in systems that alert drivers to relatively low-urgency situations.
Objective: The objective was to assess the validity of the Multiple Resources Questionnaire (MRQ) in predicting dual-task interference. Background: Subjective workload measures such as the Subjective Workload Assessment Technique (SWAT) and NASA Task Load Index are sensitive to single-task parameters and dual-task loads but have not attempted to measure workload in particular mental processes. An alternative is the MRQ. Method: In Experiment 1, participants completed simple laboratory tasks and the MRQ after each. Interference between tasks was then correlated to three different task similarity metrics: profile similarity, based on r² between ratings; overlap similarity, based on summed minima; and overall demand, based on summed ratings. Experiment 2 used similar methods but more complex computer-based games. Results: In Experiment 1 the MRQ moderately predicted interference (r = +.37), with no significant difference between metrics. In Experiment 2 the metric effect was significant, with overlap similarity excelling in predicting interference (r = +.83). Mean ratings showed high diagnosticity in identifying specific mental processing bottlenecks. Conclusion: The MRQ shows considerable promise as a cognitive-process-sensitive workload measure. Application: Potential applications of the MRQ include the identification of dual-processing bottlenecks as well as process overloads in single tasks, preparatory to redesign in areas such as air traffic management, advanced flight displays, and medical imaging.
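The three task similarity metrics named in the Method (squared correlation between ratings, summed minima, and summed ratings) can be sketched as follows. This is a minimal illustration, not the authors' scoring procedure: the function names and rating vectors are hypothetical, and the sketch assumes both tasks are rated on the same ordered set of resource dimensions.

```python
from math import sqrt


def profile_similarity(a, b):
    """Squared Pearson correlation (r^2) between two rating profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption: a flat profile is treated as having zero similarity.
        return 0.0
    r = cov / sqrt(var_a * var_b)
    return r * r


def overlap_similarity(a, b):
    """Sum over dimensions of the smaller of the two ratings."""
    return sum(min(x, y) for x, y in zip(a, b))


def overall_demand(a, b):
    """Sum of all ratings across both tasks."""
    return sum(a) + sum(b)


# Hypothetical ratings for two tasks on four resource dimensions.
task_a = [4, 0, 2, 3]
task_b = [3, 1, 0, 4]

print("profile similarity:", round(profile_similarity(task_a, task_b), 3))
print("overlap similarity:", overlap_similarity(task_a, task_b))
print("overall demand:", overall_demand(task_a, task_b))
```

Profile similarity reflects only the shape of the two profiles, whereas overlap similarity also grows with the level of demand on the dimensions both tasks use.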
Objective: To develop work guidelines for wrist posture based on carpal tunnel pressure. Background: Wrist posture is considered a risk factor for distal upper extremity musculoskeletal disorders, and sustained wrist deviation from neutral at work may be associated with carpal tunnel syndrome. However, the physiologic basis for wrist posture guidelines at work is limited. Methods: The relationship of wrist posture to carpal tunnel pressure was examined in 37 healthy participants. The participants slowly moved their wrists in extension-flexion and radioulnar deviation while wrist posture and carpal tunnel pressure were recorded. The wrist postures associated with pressures of 25 and 30 mmHg were identified for each motion and used to determine the 25th percentile wrist angles (the angles that protect 75% of the study population from reaching a pressure of 25 or 30 mmHg). Results: Using 30 mmHg, the 25th percentile angles were 32.7° (95% confidence interval [CI] = 27.2°-38.1°) for wrist extension, 48.6° (37.7°-59.4°) for flexion, 21.8° (14.7°-29.0°) for radial deviation, and 14.5° (9.6°-19.4°) for ulnar deviation. For 25 mmHg, the 25th percentile angles were 26.6° and 37.7° for extension and flexion, with radial and ulnar deviation being 17.8° and 12.1°, respectively. Conclusion: Further research can incorporate the independent contributions of pinch force and finger posture into this model. Application: The method presented can provide wrist posture guidelines for the design of tools and hand-intensive tasks.
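The 25th percentile logic in the Methods (the angle that keeps 75% of participants below the pressure threshold) can be sketched with a short example. The per-participant angles below are invented for illustration, and the linear-interpolation percentile is an assumption; the abstract does not state which percentile estimator the authors used.

```python
def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical data: for each participant, the wrist extension angle (degrees)
# at which carpal tunnel pressure first reached the chosen threshold.
threshold_angles = [20, 28, 32, 35, 40, 44, 50, 55, 61]

guideline = percentile(threshold_angles, 25)

# Participants whose threshold angle is at or above the guideline stay
# below the pressure threshold when working within the guideline.
protected = sum(angle >= guideline for angle in threshold_angles) / len(threshold_angles)

print(f"guideline angle: {guideline:.1f} degrees")
print(f"share of participants protected: {protected:.0%}")
```

Choosing a low percentile rather than the mean makes the guideline conservative: most participants reach the pressure threshold only at angles beyond it.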
Objective: This paper analyzes some of the problems with error counting as well as the difficulty of proposing viable alternatives. Background: Counting and tabulating negatives (e.g., errors) are currently popular ways to measure and help improve safety in a variety of domains. They uphold an illusion of rationality and control but may offer neither real insight nor productive routes for improving safety. Method: The paper conducts a critical analysis of assumptions underlying error counting in human factors. Results: Error counting is a form of structural analysis that focuses on (supposed) causes and consequences; it defines risk and safety instrumentally in terms of minimizing negatives and their measurable effects. In this way, physicians can be proven to be 7500 times less safe than gun owners, as they are responsible for many more accidental deaths. Conclusion: The appeal of error counting may lie in a naive realism that can enchant researchers and practitioners alike. Supporting facts will continue to be found by those looking for errors through increasingly refined methods. Application: The paper outlines a different approach to understanding safety in complex systems that is more socially and politically oriented and that places emphasis on interpretation and social construction rather than on putatively objective structural features.
Objective: The study examined the adaptability of different types of process control training across changes in task and environmental stress. Background: The literature on training leads us to expect greater flexibility for system-based training, as opposed to procedure-based training. However, the stress literature suggests that knowledge-based strategies (making use of executive control) may be more vulnerable under stress conditions. Method: Two groups were given 6 hr of training on the Cabin Air Management System (CAMS), a complex, multilevel, PC-based process control task, emphasizing either system knowledge or use of procedures. They were then required to carry out the task for 3 hr (with noise during the middle 1 hr) across a range of both familiar and unfamiliar fault scenarios. Results: For the primary control task, the system-trained group performed better, especially for less familiar and complex faults. However, for lower priority tasks requiring executive control, procedure-trained operators performed better and were less impaired by noise. Conclusion: System training was more effective for managing unexpected task events, whereas procedural training was better under noise. The results are interpreted in terms of the rationale for instructing operators in the range of strategies required for effective process skills in complex work environments. Application: Training methodologies for safety critical applications should aim to develop skill in the use of both procedural and system knowledge strategies. Operators should be trained in the most effective deployment of these strategies during unfamiliar task events and environmental stress and given stress exposure training.
Objective: We investigated the influence of ongoing task display "compellingness" on attention allocation patterns and assessed its interaction with interrupting task salience and importance. Background: There are some concerns that the compellingness of flight deck tunnel displays renders the task they support more resistant to interruptions, thus preventing the pilot from noticing cues signaling the need to divert attention to other tasks. Methods: Forty pilots flew three curved approaches in a high-fidelity simulation using a synthetic vision system (SVS) display. In addition to the primary task of flying, during the last approach they were required to select the approach path on the basis of environmental information concerning weather. The display layout supporting the primary flight task (tunnel vs. baseline display), the nature of the cue signaling the need to divert attention to the path selection task (visual vs. auditory-visual cue), and the cost of not performing the secondary task were manipulated to investigate their influence on task prioritization. Results: The modality and priority of the cue affected the frequency of the switch to the secondary task. Furthermore, pilots flying with a tunnel display were more likely to detect the change in the weather and were easily interrupted by the secondary task when priority was high. Conclusion: Our results suggest that some of the concerns regarding the negative consequences of the compelling nature of the tunnel display may not be as pronounced as thought. Applications: This study highlights the utility of the tunnel display in improving flight safety.
Objectives: We investigated whether context or different speech rates could improve older adult performance on identification of synthetically generated words. Background: Synthetic speech systems can potentially improve the daily functioning of older adults. However, research must determine whether older adults can effectively implement current text-to-speech technologies, which few studies have examined. Older adults' sensory and cognitive declines may cause difficulties in identifying words in synthetic speech. Methods: Ninety-six participants (young, middle-aged, and older adults) identified auditory monosyllabic words (half natural, half synthetic) presented in isolation or at the ends of sentences. Participants heard speech at either normal or slower rates. Results: We found an interaction of age, context, and voice type and that slower speech rates worsened performance for all groups. Contrasts revealed that context reduced age differences, though only for natural speech. Hearing acuity was highly correlated with age and fully accounts for the interaction. Conclusions: Context improves performance for everyone in natural speech. However, whereas context improves performance for synthetic speech, it does not differentially reduce the age impairment for older adults. Slower speed generally impairs everyone's performance compared with the normal rate. Applications: Systems using synthetic speech should avoid presenting words in isolation, and rich contextual support should be consistently adopted. Synthetic speech fidelity must be improved significantly before becoming truly useful for older adult populations.
Objective: To investigate the effects of gender and gloves on hand fatigue, measured by the reduction in grip strength (ΔMVC), the shift in time needed to reach maximal voluntary contraction (MVC; ΔTMVC), and the maximal endurance time (MET). Background: Information about the effect of gloves on muscle fatigue seems to be less plentiful than that on hand strength, dexterity, sensation, and so on. Method: Ten male and 10 female volunteers served as participants. A task of sustained gripping until exhaustion was used as the designated fatigue protocol. Three gloved conditions were evaluated: bare-handed and single (Cotton 1) and double (Cotton 2) cotton gloves. Results: After completion of the fatigue protocol, a greater reduction in grip strength was found for men than for women in both magnitude and percentage. Male ΔTMVC was significantly greater; that is, there was more delay in time needed to reach the MVC for men than for women when the fatigue protocol was completed. MET was longer for men than for women. During gloved conditions, except for the ΔMVC, glove use did not change any of the other responses. Specifically, the gloved effect on ΔMVC depended upon gender. Conclusion: Men had a greater reduction in grip strength and took longer to reach the MVC than did women after the fatigue protocol. Except for decreasing the ΔMVC, whether or not gloves were worn did not change any of the other responses. Application: These data are useful for glove design, manufacture, and selection.
Objective: A prototype interface was developed to support decision making during tactical operations; a laboratory experiment was conducted to evaluate the capability of this interface to support a critical activity (i.e., obtaining the status of friendly combat resources). Background: Effective interface design strategies have been developed for domains that have primarily law-driven (e.g., process control) or intent-driven (e.g., information retrieval) constraints. However, design strategies for intermediate domains in which both types of constraints are equally critical, such as military command and control, have not been explored as extensively. The principles of direct perception, direct manipulation, and perception-action loops were used to develop a hybrid interface design strategy ("perception-action icons") that was incorporated into the prototype interface. Methods: A qualitative tactical simulation and an alternative interface (an experimental version of an existing U.S. Army interface) were developed. Participants used both interfaces to provide estimates of friendly combat resources for three different categories of information at three different echelon levels. Results: The results were unequivocal, indicating that the interface with perception-action icons produced significantly better performance. Conclusion: The perception-action icon design strategy was very effective in this experimental context. The potential for this design strategy to be useful for other intermediate domains is explored. Application: Actual or potential applications of this research include both specific interface design strategies for military command and control and general interface design principles for intermediate work domains.
Objective: We experimentally tested the degree to which the size-weight illusion depends on perceptual conditions allowing the observer to assume that both the visual and the kinesthetic stimuli of a weight seen and lifted emanate from the same object. We expected that the degree of the illusion would depend on the "realism" provided by different kinds of virtual reality (VR) used when the weights are seen in virtual reality and at the same time lifted in natural reality. Background: Welch and Warren (1980) reported that an intermodal influence can be expected only if perceptual information of different modalities is compellingly related to only one object. Method: Objects of different sizes and weights were presented to 50 participants in natural reality or in four virtual realities: two immersive head-mounted display VRs (with or without head tracking) and two nonimmersive desktop VRs (with or without screening from input of the natural environment using a visor). The objects' heaviness was scaled using the magnitude estimation method. Results: Data show that the degree of the illusion is largest in immersive and lowest in nonimmersive virtual realities. Conclusion: The higher the degree of the illusion is, the more compelling the situation is perceived to be and the more closely the observed data correspond to the data predicted for the illusion in natural reality. This shows that the kind of mediating technology used strongly influences the presence experienced. Application: The size-weight illusion's sensitivity to conditions that affect the sense of presence makes it a promising objective presence measure.
Objective: Compare muscle activity and trunk stiffness during isometric trunk flexion and extension exertions. Background: Elastic stiffness of the torso musculature is considered the primary stabilizing mechanism of the spine. Therefore, stiffness of the trunk during voluntary exertions provides insight into the stabilizing control of pushing and pulling tasks. Methods: Twelve participants maintained an upright posture against external flexion and extension loads applied to the trunk. Trunk stiffness, damping, and mass were determined from the dynamic relation between pseudorandom force disturbances and subsequent small-amplitude trunk movements recorded during the voluntary exertions. Muscle activity was recorded from rectus abdominis, external oblique, lumbar paraspinal, and internal oblique muscle groups. Results: Normalized electromyographic activity indicated greater antagonistic muscle recruitment during flexion exertions than during extension. Trunk stiffness was significantly greater during flexion exertions than during extension exertions despite similar levels of applied force. Trunk stiffness increased with exertion effort. Conclusion: Theoretical and empirical analyses reveal that greater antagonistic cocontraction is required to maintain spinal stability during trunk flexion exertions than during extension exertions. Measured differences in active trunk stiffness were attributed to antagonistic activity during flexion exertions with possible contributions from spinal kinematics and muscle lines of action. Application: When compared with trunk extension exertions, trunk flexion exertions such as pushing tasks require unique neuromuscular control that is not simply explained by differences in exertion direction. Biomechanical analyses of flexion tasks must consider the stabilizing muscle recruitment patterns when evaluating spinal compression and shear loads.
Vidulich and Tsang (2007, this issue) offer a number of criticisms of the research, with two in particular focusing on (a) the fact that sequential processing of two tasks, resulting from widely separated displays and discrete responses, may limit the contributions of multiple resources to time-sharing in the paradigm chosen; and (b) concerns about the structure of the three tasks in Experiment 2, with one pair mandating more continuous demands and therefore concurrent processing, and the other two pairs probably prohibiting it. In both my older (Sarno & Wickens, 1995; Wickens, 1980, 1984, 1991) and more recent (Horrey & Wickens, 2004; Wickens, 2002, 2004) writings on multiple resource theory, I have emphasized the separate and independent contributions of three components to predicting dual-task interference: (a) the total demand for resources (i.e., the difficulty or workload of the component tasks), (b) the similarity between the two tasks, and (c) the resource allocation policy.
Using a battery of tasks and a factor analytic technique to examine the patterns of hemispheric lateralization effects, Boles and Law (1998) elaborated on Wickens's (1984) model and dramatically extended the number of resource dimensions. Although Boles and Law (1998) argued that the Wickens (1984) model is overly restrictive, and it most certainly has limitations, they offered little information on how their expanded model would enhance predictability of time-sharing performance or how their MRQ tool might improve upon other multiple resource-based subjective instruments such as the Workload Index (W/INDEX; Riley, Lyall, & Wiener, 1994) and Workload Profile (Tsang & Velazquez, 1996) techniques.