It is our great pleasure to welcome you to this issue of the Proceedings of the ACM on Human-Computer Interaction, the first to focus on contributions from the Interactive Surfaces and Spaces (ISS) research community. Interactive surfaces and spaces increasingly pervade our everyday life, appearing in various sizes, shapes, and application contexts, and offering a rich variety of ways to interact. This diverse research community explores the design, development, and use of new and emerging tabletop, digital-surface, interactive-space, and multi-surface technologies.
The call for articles for this issue on ISS attracted 87 submissions from all over the world. After the first round of reviewing, 26 articles (29.9%) requiring minor revisions were invited to the Revise and Resubmit phase, and 39 articles (44.8%) requiring major revisions were invited to the next full PACMHCI ISS review cycle in 2021 (65 articles in total, 74.7%). The editorial committee worked hard over the two iterations of the review process to arrive at final decisions. In the end, 25 articles (28%) were accepted. All authors of the accepted articles are invited to present at the ISS conference, held November 8-11, 2020.
This issue exists because of the dedicated volunteer effort of 31 senior editors, who served as Associate Chairs (ACs), and 146 expert reviewers, who together ensured high-quality and insightful reviews for all articles in both rounds. Reviewers and committee members were kept constant for papers submitted to both rounds.
Mobile users rely on typing-assistance mechanisms such as word prediction and autocorrection. Previous studies of mobile keyboards showed decreased performance with heavy use of word prediction, which points to a need for more research to better understand the effectiveness of predictive features for different users. Our work aims at such an understanding of how users interact with autocorrection and the prediction panel while entering text, in particular when these features fail. We present a crowdsourced mobile text entry study with 170 participants. Our mobile web application simulates autocorrection and word prediction to capture user behaviour around these features. We found that using word prediction saves an average of 3.43 characters per phrase but also adds an average of two seconds compared to simply typing the word, resulting in a negative net effect on text entry speed. We also found that fixing a wrong autocorrection takes 5.5 seconds on average, but that autocorrection does not have a significant effect on typing speed.
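The net-speed argument above can be made concrete with a back-of-envelope break-even calculation. This sketch uses only the averages reported in the abstract (3.43 characters saved, 2 seconds of overhead); the typing speeds are illustrative assumptions, not study data:

```python
def prediction_pays_off(typing_cps, chars_saved=3.43, overhead_s=2.0):
    """Word prediction saves time only if typing the saved characters
    would have taken longer than the overhead of using the panel."""
    return chars_saved / typing_cps > overhead_s

# Break-even typing speed below which prediction starts to help:
break_even_cps = 3.43 / 2.0               # ~1.7 characters per second
break_even_wpm = break_even_cps * 60 / 5  # 5 chars per "word" convention
print(round(break_even_wpm, 1))  # -> 20.6

# At a typical mobile typing speed of ~30 WPM (2.5 cps), prediction
# costs more time than it saves, consistent with the reported
# negative effect on entry speed:
print(prediction_pays_off(typing_cps=2.5))  # -> False
```

Under these numbers, only users typing slower than roughly 20 WPM would gain time from accepting predictions.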
Autogrip is a new thimble that enables force feedback devices to attach themselves to a finger autonomously. Although self-attachment is a simple concept, it has never been explored in the space of force feedback devices, where current thimble solutions require complex attachment procedures and often swapping between interchangeable parts. Self-attachment is advantageous in many applications, such as immersive spaces, multi-user and walk-up-and-use contexts, and especially multi-point force feedback systems, as it allows a lone user to quickly attach multiple devices to fingers on both hands - a difficult task with current thimbles. We present the design of our open-source contraption, Autogrip, a one-size-fits-all thimble that retrofits to existing force feedback devices, enabling them to attach themselves to a fingertip automatically. We demonstrate Autogrip by retrofitting it to a Phantom 1.5 and a 4-finger Mantis system. We report preliminary user-testing results indicating that Autogrip was three times faster to attach than a typical method, and we present further refinements based on user feedback.
In this paper, we explore input with wearables that can be attached to and detached from any of our regular clothes at will. These wearables cause no permanent effect on our clothing and are suitable to be worn anywhere, making them very similar to the badges we wear. To explore this idea of non-permanent badge input, we studied various methods of fastening objects to clothing and organised them in the form of a design space. We leverage this synthesis, along with the literature and existing products, to present possible interaction gestures that these badge-based wearables can enable.
The human ear is highly sensitive and accessible, making it especially suitable as an interface for interacting with smart earpieces or augmented glasses. However, previous work on ear-based input has mainly addressed gesture-sensing technology and researcher-designed gestures. This paper aims to bring a deeper understanding of gesture design. We conducted a user elicitation study in which 28 participants each designed gestures for 31 smart-device tasks, resulting in a total of 868 gestures. Based on these gestures, we compiled a taxonomy and distilled the considerations underlying the participants' designs, which offer insights into their design rationales and preferences. We then propose a set of user-defined gestures and share interesting findings. We hope this work sheds light not only on sensing technologies for ear-based input, but also on the interface design of future wearables.
Gesture recognition plays a fundamental role in emerging Human-Computer Interaction (HCI) paradigms. Recent advances in wireless sensing show promise for device-free and pervasive gesture recognition. Among these technologies, RFID has gained much attention given its low cost, light weight, and pervasiveness, but pioneering studies on RFID sensing still suffer from two major problems when it comes to gesture recognition. First, they have only been evaluated on simple whole-body activities rather than complex, fine-grained hand gestures. Second, they cannot work effectively in new domains (i.e., new users or environments) without retraining. To tackle these problems, we propose RFree-GR, a domain-independent RFID system for complex and fine-grained gesture recognition. First, we exploit signals from a multi-tag array to profile the sophisticated spatio-temporal changes of hand gestures. Then, we elaborate a Multimodal Convolutional Neural Network (MCNN) to aggregate information across signals and abstract complex spatio-temporal patterns. Furthermore, we introduce an adversarial model into our deep learning architecture to remove domain-specific information while retaining information relevant to gesture recognition. We extensively evaluate RFree-GR on 16 commonly used American Sign Language (ASL) words. The average accuracies for new users and new environments (new setup and new position) are 89.03%, 90.21%, and 88.38%, respectively, significantly outperforming existing RFID-based solutions and demonstrating the superior effectiveness and generalizability of RFree-GR.
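The adversarial component described above follows the general pattern of domain-adversarial training: a domain classifier is trained on the shared features, while a gradient reversal layer flips its gradient so the feature extractor learns to confuse the domain classifier without hurting gesture classification. A minimal NumPy sketch of such a layer (the class name and lambda value are illustrative assumptions, not taken from the paper):

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; scales gradients by -lambda in the
    backward pass. Placed between the feature extractor and the domain
    classifier, it makes the extractor *maximize* domain-classification
    loss, i.e., learn domain-invariant features."""

    def __init__(self, lam=0.5):
        self.lam = lam

    def forward(self, features):
        return features  # no change to the activations

    def backward(self, grad_from_domain_head):
        return -self.lam * grad_from_domain_head  # reversed gradient

grl = GradientReversal(lam=0.5)
x = np.array([1.0, -2.0, 3.0])
print(grl.forward(x))                         # unchanged features
print(grl.backward(np.array([0.2, 0.4, -0.6])))  # -> [-0.1 -0.2  0.3]
```

The gesture head is trained normally; only the gradient flowing back from the domain head is reversed, so the total objective behaves like gesture_loss - lambda * domain_loss with respect to the shared features.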
One-handed Back-of-Device (BoD) interaction has proved to be desired, and sometimes unavoidable, on mobile touchscreen devices, for both the preferred and the non-preferred hand. Although users' two hands are asymmetric, the impact of this asymmetry on the performance of mobile interaction has received little study so far. Research on one-handed BoD interaction has mostly focused on the preferred hand, even though users cannot always avoid handling their phone with the non-preferred hand in real life. To better design one-handed BoD interaction tailored to each hand, identifying and measuring the impact of this asymmetry is critical. In this paper, we study the impact of the asymmetry between the preferred and non-preferred hands on performance when interacting with one hand on the back of a mobile touch surface. Empirical data indicate that users' preferred hand performs better than the non-preferred hand in target acquisition tasks, for both time (+10%) and accuracy (+20%). In contrast, for steering tasks, we found little difference in performance between the two hands. These results help the HCI community design mobile interaction techniques tailored to each hand only when necessary. We present implications for research and design directly based on the findings of the study, in particular to reduce the impact of the asymmetry between hands and to improve the performance of both hands in target acquisition.
Supporting many gestures on small surfaces allows users to interact remotely with complex environments, such as smart homes, large remote displays, or virtual reality environments, and to switch between them (e.g., an AR setup in a smart home). Providing eyes-free gestures in these contexts is important, as it avoids disrupting the user's visual attention. However, very few techniques enable large sets of commands on small wearable devices that support the user's mobility, and even fewer provide eyes-free interaction. We present Side-Crossing Menus (SCM), a gestural technique enabling large sets of gestures on small interactive surfaces such as a smartwatch. Contrary to most gestural techniques, SCM relies on broad and shallow menus that favor small and rapid gestures. We demonstrate with a first experiment that users can efficiently perform these gestures eyes-free with the aid of tactile cues: 95% accuracy after 20 minutes of training on a representative set of 30 gestures among 172. In a second experiment focusing on the learning of SCM gestures, we observed no significant differences from conventional Multi-stroke Marking Menus in gesture accuracy or recall rate. As the two techniques use contrasting menu structures, our results indicate that SCM is a compelling alternative for enhancing the input capabilities of small surfaces.
Previous research demonstrated that users can accurately recognize tactile textures on mobile surfaces. However, those experiments were only run in a lab setting, and users' ability to recognize tactile textures in real-world environments remained unclear. In this paper, we investigate the effects of physically challenging activities on tactile texture recognition. We consider five conditions: (1) seated in an office, (2) standing in an office, (3) seated in a tramway, (4) standing in a tramway, and (5) walking in the street. Our findings indicate that performance deteriorated when walking compared to the remaining conditions. However, despite this deterioration, the recognition rate stayed above 82%, suggesting that tactile textures can be effectively recognized and used in various physically challenging activities, including walking.
In artworks such as paper-cutting, matching the artist's skill to the difficulty of the rough sketch is important for improving that skill. We developed a system that measures the distances and widths of cutting lines and patterns pictures based on the steering law, allowing us to quantitatively evaluate the difficulty level of a picture. We also developed an interactive system to support the improvement of cutting-pressure control, one of the key skills in paper-cutting. Furthermore, we analyzed users' psychological state when making a paper-cutting based on the psychological flow state, a highly focused state that tends to arise when the user's skill is balanced with the difficulty of the task. Since skill and difficulty are hard to analyze quantitatively, many researchers evaluate them qualitatively. In this paper, we define "skill" as the cutting ability and "challenge" as the difficulty level of the picture, and we evaluate the improvement of the user's skill and the flow state through quantitative analysis. To evaluate the psychological state during the paper-cutting process, we developed a questionnaire based on an existing flow state questionnaire, and we describe how skill improvement changes with the combination of user skill and picture difficulty.
In this paper, we investigated how "lying down'' body postures affect smartphone use and user interface (UI) design, extending previous research on body postures, handgrips, and the movement of the smartphone. We did this in three steps: (1) an online survey that examined what types of lying-down postures participants adopted when operating a smartphone; (2) a breakdown of these postures in terms of body angle (i.e., users facing down, facing up, and on their side) and body support; and (3) an experiment examining the effects of these body angles and body supports on participants' handgrips. We found that the smartphone moves the most (is the most unstable) in the "facing up (with support)'' condition. Additionally, we discovered that participants' preferred body postures were those that produced the least smartphone motion (the most stability).
We propose a 3D-printed interface, CAPath, in which conductive contact points are arranged in a grid layout. This structure allows not only specific inputs (e.g., scrolling or pinching) but also general 2D inputs and gestures that fully leverage the "touch surface." We provide the requirements for fabricating the interface and implement a design system to generate 3D objects with the conductive grid structure. The CAPath interface can be utilized in uniquely shaped interfaces and opens up application fields that cannot currently be accessed with existing passive touch extensions. Our contributions also include an evaluation of the recognition accuracy of touch operations with the implemented interfaces. The results show that our technique is promising for fabricating customizable touch-sensitive interactive objects.
Whilst new patents and announcements advertise the technical availability of foldable displays, which can be folded to some extent, there is still a lack of fundamental and applied understanding of how to model, design, and prototype graphical user interfaces for these devices before actually implementing them. Without waiting for their off-the-shelf availability and without being tied to any physical folding mechanism, Flecto defines a model, an associated notation, and supporting software for prototyping graphical user interfaces running on foldable displays, such as foldable smartphones or assemblies of foldable surfaces. For this purpose, we use an extended notation of the Yoshizawa-Randlett diagramming system, which describes the folds of origami models, to characterize a foldable display and define possible interactive actions based on its folding operations. A guiding method for rapidly prototyping foldable user interfaces is devised and supported by Flecto, a design environment where foldable user interfaces are simulated in a 3D environment instead of in physical reality. We report on a case study to demonstrate Flecto in action, and we gather user feedback on Flecto using Microsoft Product Reaction Cards.
Extended Reality (XR) systems (which encompass AR, VR, and MR) constitute an emerging field that enables the development of novel visualization and interaction techniques. To develop and assess such techniques, researchers and designers must choose which development tools to adopt, often with very little information about how those tools support some of the most basic tasks in information visualization, such as selecting data items, linking, and navigating. As a solution, we propose Flex-ER, a flexible web-based environment that enables users to prototype, debug, and share experimental conditions and results. Flex-ER enables users to quickly switch between hardware platforms and input modalities by using a JSON specification that supports defining both interaction techniques and tasks at low cost. We demonstrate the flexibility of the environment through three task design examples: brushing, linking, and navigating. A qualitative user study suggests that Flex-ER can be helpful for prototyping and exploring different interaction techniques for immersive analytics.
Assembly procedures are a common task in several application domains. Augmented Reality (AR) is considered to have great potential for assisting users performing such tasks. However, poor interaction design and a lack of studies often result in complex and hard-to-use AR systems. This paper considers three different interaction methods for assembly procedures: touch gestures on a mobile device; mobile device movements; and 3D controllers with a video see-through HMD. It also describes a controlled experiment comparing the acceptance and usability of these methods in an assembly task using Lego blocks. The main conclusions are that participants were fastest using the 3D controllers and video see-through HMD. Participants also preferred the HMD condition, even though some reported light symptoms of nausea, sickness, and/or disorientation, probably due to the limited resolution of the HMD cameras used in the video see-through setting and some latency issues. In addition, although some research claims that manipulating virtual objects through movements of the mobile device can be considered natural, this condition was the least preferred by participants.
Interactive tabletops offer unique collaborative features, particularly their size, geometry, orientation and, more importantly, their ability to support multi-user interaction. Although previous efforts have been made to make interactive tabletops accessible to blind people, their potential for collaborative activities remains unexplored. In this paper, we present the design and implementation of a multi-user auditory display for interactive tabletops, supporting three feedback modes that vary in how much information about the partners' actions is conveyed. We conducted a user study with ten blind people to assess the effect of feedback modes on workspace awareness and task performance. Furthermore, we analyze the type of awareness information exchanged and the emergent collaboration strategies. Finally, we provide implications for the design of future tabletop collaborative tools for blind users.
While end users can acquire full 3D gestures with many input devices, they often capture only 3D trajectories, which are 3D uni-path, uni-stroke, single-point gestures performed in thin air. Such trajectories, with their (x, y, z) coordinates, can be interpreted as three 2D stroke gestures projected onto three planes, i.e., XY, YZ, and ZX, thus making them admissible to established 2D stroke gesture recognizers. To investigate whether 3D trajectories can be effectively and efficiently recognized, four 2D stroke gesture recognizers, i.e., $P, $P+, $Q, and Rubine, are extended to the third dimension: $P3, $P+3, $Q3, and Rubine-Sheng, an extension of Rubine for 3D with more features. Two new variations are also introduced: $F for flexible cloud matching and FreeHandUni for uni-path recognition. Rubine3D, another extension of Rubine for 3D that projects the 3D gesture onto three orthogonal planes, is also included. These seven recognizers are compared on three challenging datasets containing 3D trajectories, i.e., SHREC2019 and 3DTCGS in a user-independent scenario, and 3DMadLabSD with its four domains in both user-dependent and user-independent scenarios, with varying numbers of templates and sampling rates. Individual recognition rates and execution times per dataset, as well as aggregated results across all datasets, show a highly significant advantage of $P+3 over its competitors. The potential effects of the dataset, the number of templates, and the sampling are also studied.
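The core idea of extending a $-family recognizer to 3D is to treat gestures as clouds of (x, y, z) points and score a candidate cloud against each template by a minimum-matching distance. A simplified, unweighted sketch of greedy cloud matching follows; it omits the resampling, scaling, and centering normalizations that the actual recognizers perform, and the weighting scheme $P applies to later matches:

```python
import math

def dist3(p, q):
    """Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def greedy_cloud_distance(candidate, template):
    """Match each candidate point to its nearest still-unmatched
    template point and sum the distances (smaller = better match)."""
    unmatched = list(range(len(template)))
    total = 0.0
    for p in candidate:
        j = min(unmatched, key=lambda k: dist3(p, template[k]))
        total += dist3(p, template[j])
        unmatched.remove(j)
    return total

tmpl = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
assert greedy_cloud_distance(tmpl, tmpl) == 0.0
# The same trajectory shifted 0.1 along z scores 4 * 0.1:
cand = [(0, 0, 0.1), (1, 0, 0.1), (1, 1, 0.1), (1, 1, 1.1)]
print(round(greedy_cloud_distance(cand, tmpl), 3))  # -> 0.4
```

A recognizer built on this would resample every gesture to a fixed number of points and return the template with the smallest distance.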
This paper presents a multitouch vocabulary for interacting with parallel coordinates plots on wall-sized displays. The gesture set relies on principles such as two-finger range definition, a functional distinction between background and foreground for applying the Hold-and-Move concept to wall-sized displays, and fling-based interaction for triggering and controlling long-range movements. Our implementation demonstrates that out-of-reach problems and the limitations of multitouch technology and display size can be tackled by the coherent integration of our multitouch gestures. Expert reviews indicate that our gesture vocabulary helps solve typical analysis tasks that require interaction beyond arm's reach, and they also show how often certain gestures were used.
Software engineers routinely use sketches (informal, ad-hoc drawings) to visualize and communicate complex ideas for colleagues or themselves. We hypothesize that sketching could also be used as a novel interaction modality in integrated software development environments (IDEs), allowing developers to express desired source code manipulations by sketching right on top of the IDE, rather than remembering keyboard shortcuts or using a mouse to navigate menus and dialogs. For an initial assessment of the viability of this idea, we conducted an elicitation study that prompted software developers to express a number of common IDE commands through sketches. For many of our task prompts, we observed considerable agreement in how developers would express the respective commands through sketches, suggesting that further research on a more formal sketch-based visual command language for IDEs would be worthwhile.
Visual exploration of maps often requires a contextual understanding at multiple scales and locations. Multiview map layouts, which present a hierarchy of multiple views to reveal detail at various scales and locations, have been shown to support better performance than traditional single-view exploration on desktop displays. This paper investigates the extension of such layouts of 2D maps into 3D immersive spaces, which are not limited by the real-estate barrier of physical screens and support sensemaking through spatial interaction. Based on our initial implementation of immersive multiview maps, we conduct an exploratory study with 16 participants aimed at understanding how people place and view such maps in immersive space. We observe the layouts produced by users performing search, comparison, and route-planning map exploration tasks. Our qualitative analysis identifies patterns in layout geometry (spherical, spherical cap, planar), overview-detail relationship (central window, occluding, coordinated), and interaction strategy. Based on these observations, along with qualitative feedback from a user walkthrough session, we identify implications and recommend features for immersive multiview map systems. Our main findings are that participants tend to prefer and arrange multiview maps in a spherical cap layout around them and that they often rearrange the views during tasks.
This research establishes a better understanding of syntax choices in speech interactions and of how speech, gesture, and multimodal gesture-and-speech interactions are produced by users in unconstrained object manipulation environments using augmented reality. The work presents a multimodal elicitation study conducted with 24 participants. The canonical referents for translation, rotation, and scale were used along with some abstract referents (create, destroy, and select). In this study, time windows for gesture and speech multimodal interactions are developed using the start and stop times of gestures and speech, as well as the stroke times of gestures. While gestures commonly precede speech by 81 ms, we find that the stroke of the gesture is commonly within 10 ms of the start of speech, indicating that the information content of a gesture and its co-occurring speech are well aligned. Lastly, we examine the trends across the most common proposals for each modality, showing that disagreement between proposals is often caused by variations in hand posture or syntax. This allows us to present aliasing recommendations to increase the percentage of users' natural interactions captured by future multimodal interactive systems.
Coronavirus is thought to spread through close person-to-person contact. While the primary means of spread is believed to be inhaling respiratory droplets or aerosols, it may also spread through touching inanimate objects that carry the virus, such as doorknobs and handrails ("fomites''). The Centers for Disease Control and Prevention (CDC) therefore recommends that individuals maintain a "social distance'' of more than six feet from one another. It further notes that an individual may be infected by touching a fomite and then touching their own mouth, nose, or possibly their eyes. We propose the use of computer vision techniques to combat the spread of coronavirus by sounding an audible alarm when an individual touches their own face, or when multiple individuals come within six feet of one another or shake hands. We further propose using depth cameras to track where people touch parts of their physical environment throughout the day, together with a simple model of disease spread among potential fomites. Projection mapping techniques can be used to display likely fomites in real time, while head-worn augmented reality systems can help custodial staff perform more effective cleaning of surfaces. Such techniques may find application in particularly vulnerable settings such as schools, long-term care facilities, and physicians' offices.
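On the sensing side, the six-foot proximity alarm reduces to pairwise distance checks over 3D person positions estimated from the depth camera. A minimal sketch (the function name, threshold constant, and example positions are illustrative assumptions):

```python
import itertools
import math

SIX_FEET_M = 1.83  # six feet expressed in metres

def too_close(positions, threshold=SIX_FEET_M):
    """Given 3D positions (in metres) of people detected by a depth
    camera, return the index pairs closer than the distancing
    threshold, i.e., the pairs that should trigger the alarm."""
    pairs = []
    for (i, p), (j, q) in itertools.combinations(enumerate(positions), 2):
        if math.dist(p, q) < threshold:
            pairs.append((i, j))
    return pairs

# Three people in camera coordinates; the first two are ~1.1 m apart:
people = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.5), (4.0, 0.0, 3.0)]
print(too_close(people))  # -> [(0, 1)]
```

In a deployed system, the position list would be refreshed per frame from person detection on the depth stream, and a non-empty result would sound the alarm.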
While the sport of golf offers unique opportunities for people of all ages to be active, it also requires mastering complex movements before it can be fully enjoyed. Extensive training is required before achieving any success, which may demotivate prospective players. To aid beginner golfers and enable them to practice without supervision, we designed Subtletee, a system that enhances golfers' bodily awareness by providing feedback on the position of their feet and elbows. We first identified key problems golf beginners face through expert interviews. We then evaluated different feedback modalities for Subtletee (visual, tactile, and auditory) in an experiment with 20 beginner golfers. We found that providing proprioceptive feedback increased users' performance on the driving range, with tactile feedback providing the most benefit. Our work provides insights on delivering feedback during physical activity that requires high-precision movements.
The dual Gaussian distribution hypothesis has been used to predict the success rate of target pointing on touchscreens. Bi and Zhai evaluated their success-rate prediction model in off-screen-start pointing tasks. However, we hypothesized that their model could also be used for on-screen-start pointing tasks. We discuss the reasons why and empirically validate this hypothesis in a series of four experiments with various target sizes and distances. The prediction accuracy of Bi and Zhai's model was high in all of the experiments, with a 10-point absolute (or 14.9% relative) prediction error at worst. Also, we show that there is no clear benefit to integrating the target distance when predicting the endpoint variability and success rate.
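The general idea behind such success-rate models (a sketch of the family of approaches, not Bi and Zhai's exact formulation) is to integrate a Gaussian endpoint distribution over the target extent; in one dimension this reduces to an error function:

```python
import math

def hit_probability(width, sigma):
    """Probability that a Gaussian-distributed tap endpoint (standard
    deviation sigma, centred on the target) lands inside a 1D target
    of the given width: the integral of N(0, sigma^2) over
    [-width/2, width/2], which equals erf(width / (2 * sqrt(2) * sigma))."""
    return math.erf(width / (2 * math.sqrt(2) * sigma))

# A 6 mm target with a 2 mm endpoint spread is hit about 87% of the time:
print(round(hit_probability(6.0, 2.0), 3))
```

The modeling work then lies in estimating sigma from the target size (and, per the finding above, apparently not from the target distance).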
We investigate the performance of one-handed touch input on the side of a mobile phone. A first experiment examines grip change and subjective preference when reaching for side targets using different fingers. Results show all locations can be reached with at least one finger, but the thumb and index finger are most preferred and require less grip change for positions along the sides. Two follow-up experiments examine taps and flicks using the thumb and index finger in a new two-dimensional input space. A side-touch sensor is simulated with a combination of capacitive sensing and motion tracking to distinguish touches on the lower, middle, or upper edges. When tapping, index-finger and thumb speeds are similar, with the thumb more accurate and comfortable; the lower edge is most reliable and the middle edge most comfortable. When flicking with the thumb, the upper edge is fast and rated highly.
Visual graphics are widespread in digital media and useful in many contexts of daily life. However, access to this type of graphical information remains challenging for people with visual impairments (VI). In this study, we designed and evaluated an on-hand vibrotactile interface that enables users with VI to explore digital graphics presented on tablets. We first conducted a set of exploratory tests with both people with VI and blindfolded (BF) people to investigate several design factors. We then conducted a comparative experiment to verify that on-hand vibrotactile cues (indicating direction and progression) can enhance the non-visual exploration of digital graphics. The results, based on 12 participants with VI and 12 BF participants, confirmed the usability of the technique and revealed that the visual status of the users does not impact graphics identification and comparison tasks.