Mobility is an essential norm of human society. Today's wireless and mobile networking technology already provides usable connectivity between our personal devices and (edge) cloud services in 99% of cases. The remaining, fragmented 1% of connectivity scenarios, including but not limited to extremely low power, high mobility, and massive access, still face domain-specific networking challenges. More importantly, these minority cases are likely to become the majority tomorrow, as our global society consciously evolves toward greater energy and production efficiency.
The theme of my research is to design and build practical software-hardware systems that address domain-specific problems of performance, reliability, and energy efficiency in next-generation networking and computer systems. The objects of interest are networked under challenging conditions of limited energy, high mobility, flash crowds, and their intersections, which represent today's 1% and likely tomorrow's 99% of mobile networking use cases. The goal is ubiquitous mobile connectivity.
Over the past few years, my research group has worked on research projects primarily focusing on two scenarios: i) backscatter communication (and networking) where massive tag objects need to stream their sensory data with high fidelity, and ii) extreme high mobility data networking where communicating endpoints are in high relative mobility (e.g., 200+ km/h). Another important branch in my group is working on making the best use of the ubiquitous wireless signals and mobile/IoT sensory data to extract human-centered contextual information for social good applications, together with its derivative yet mission-critical security problem.
As IoT quickly turns into reality, billions or even trillions of ambient objects are expected to gain Internet access in the near future to benefit everyday life. However, such massive scaling poses grand challenges to energy management in real-world deployment: today’s wireless technology for IoT (e.g., BLE, WiFi, ZigBee, LoRa, and NB-IoT) typically operates at tens or hundreds of mW, which is incompatible with long-term deployment without battery replacement. One fundamental technical approach to this problem is backscatter communication, an active research front that offers connectivity on the order of uW to battery-free sensors by reflecting an existing signal instead of generating a transmission. My group has developed novel backscatter communication techniques (including optical, radio, and magnetic) tailored for different applications and scenarios (vehicular, logistics, AR gaming, etc.) to address their unique challenges in real-world deployment.
Mainstream backscatter systems (such as RFID) operate at radio frequencies. However, they not only inherit radio multipath problems but also interfere with the ambient wireless data traffic in daily use and exacerbate the “spectrum crunch” problem. To address these problems, we invented VLID, a new backscatter technology that operates in the visible light spectrum. The principle of VLID is to backscatter optical signals by modulating retroreflection. It equips a power-constrained tag device (the VLID tag) with a suite of a retroreflector and LCDs: i) the retroreflector ensures that the optical signal is backscattered precisely toward the interrogator (the VLID reader) and hence facilitates reliable (mobile) bidirectional connectivity between the two; ii) the LCD changes its transparency in response to the applied voltage and hence modulates the amplitude of the retroreflected (backscattered) signal. One of the key challenges of visible light backscatter for IoT applications is to provide a data link with a rate beyond hundreds of bps, which is fundamentally limited by the low refresh/switching rate (i.e., 100 to 240 Hz) of commercial LCD shutters. We tackle this challenge with a dedicated link design covering modulation, coding, and demodulation. We take advantage of the pulse (charging/discharging) monotonicity of the LCD shutter (as the optical modulator) and design a trend-based modulation. When coupled with the Miller code, the modulation achieves a 6 dB SNR gain over the existing PAM design at the same data rate. We further design a code-assisted demodulation algorithm that formulates the nonlinear optical channel into the Viterbi algorithm to minimize bit errors. The link design improves the data rate over the OOK baseline by 8x, to 1 Kbps.
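To make the coding step concrete, the following is a minimal sketch of textbook Miller (delay) encoding, assuming two half-bit slots per data bit. It is not the VLID link implementation: the trend-based modulation and the Viterbi-based demodulation are described in the paper, and the function names here are hypothetical. The sketch only illustrates why such a code suits a slow modulator: the signal level never stays constant for longer than two bit periods.

```python
def miller_encode(bits, start_level=0):
    """Textbook Miller (delay) code; returns two half-bit levels per input bit."""
    level = start_level
    prev = None
    halves = []
    for b in bits:
        if b == 1:
            # A '1' always has a transition in the middle of the bit.
            halves.append(level)
            level ^= 1
            halves.append(level)
        else:
            # A '0' has no mid-bit transition; a transition is placed at the
            # bit boundary only when the previous bit was also '0'.
            if prev == 0:
                level ^= 1
            halves.extend([level, level])
        prev = b
    return halves


def miller_decode(halves):
    """A mid-bit transition means '1', no mid-bit transition means '0'."""
    return [1 if halves[i] != halves[i + 1] else 0
            for i in range(0, len(halves), 2)]


def longest_run(halves):
    """Length (in half-bit slots) of the longest run of a constant level."""
    best = run = 1
    for a, b in zip(halves, halves[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best
```

For example, `miller_encode([1, 0, 0, 1, 1, 0])` yields `[0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0]`, and `miller_decode` recovers the original bits.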
As the pioneer of the concept of visible light backscatter communication, this work designs, prototypes, and evaluates VLID, the first-of-its-kind battery-free optical tag device (in analogy to RFID) and sub-mW IoT connectivity technology. See our [MobiCom'17] paper for technical details.
Autonomous driving is widely regarded as a key basis and enabler for the future of transportation, logistics, and smart cities. Currently, the automotive and IT industries are investing heavily in key technologies for autonomous vehicles, including cameras/LiDAR and their data pipelines for perceiving and understanding the road context and making local decisions. However, recent studies show that such passive imaging sensors fail to deal with dynamic road and traffic conditions such as accidents, road work, water, or icy surfaces, because they generalize poorly to unseen or corner-case settings. We design RetroI2V, an alternative, complementary, and equally important approach that invests in the road infrastructure and makes the road intelligent, so that the dynamic road context is recognized more accurately. RetroI2V is a novel, inexpensive, and easily deployable end-to-end passive vehicle-to-anything (V2X) coordination system built upon VLID. It renovates on-road retroreflective objects (e.g., traffic signs, roadblocks, warning triangles, and reflective vests) into ultra-low-power optical tags, and creates passive visible light data links based on the light reflected from them to deliver road context information (e.g., emergency help requests, reduced safe speed limits in bad weather, and customized location-based services) to vehicles. RetroI2V features optical frontend designs, including late polarization, complementary retroreflective signaling, and polarization-based differential reception, which are crucial to avoid flickering and to suppress interference and multipath. It also features a decentralized MAC protocol that incorporates excitatory carrier sensing for collision detection and a virtual ID generation mechanism for efficient address assignment and tag discovery in such fully decentralized and transient networks.
Our prototyped system shows that RetroI2V supports up to 101m communication range, and achieves a sub-1% miss rate for inter-state highway scenarios in different weather conditions including sunny and rainy days.
This work presents a brand-new, efficient, and scalable V2X technology that reduces cost by 100x compared with state-of-the-art V2X solutions (e.g., C-V2X and DSRC), and it won the SenSys'18 Best Demo Award. See our [MobiCom'20] paper for technical details, and the demo video and MIT Tech Review coverage for more information.
Considering that an active VLC link or an RF backscatter link can easily achieve tens or even hundreds of Kbps, a slow (i.e., sub-Kbps) uplink from VLID can easily become the bottleneck in a networked environment. From the channel capacity perspective, our key insight is that, due to the nonlinearity of liquid crystal modulation (LCM), the available channel capacity is not fully utilized when the link has a sufficiently high SNR, i.e., the SNR is not efficiently traded off for data rate. Therefore, we break the whole LCD shutter panel into smaller pixels and rethink advanced modulation scheme designs to push the channel capacity utilization in both the time and polarization domains. We propose delayed superimposed modulation (DSM) and polarization-based quadrature amplitude modulation (PQAM) to boost the data rate of VLID. Specifically, DSM interleaves and orchestrates the timing of the charging pulses from multiple LCM pixels, constructs an inter-symbol interference channel, provides a capacity gain that is approximately linear in the resolution of the LCM pixel array, and fully utilizes the available bandwidth on this unconventional nonlinear optical channel. PQAM exploits the polarization manipulation capability of the LCD, constructs an orthogonal basis in analogy to the QAM scheme on two channels in the polarization domain rather than in the phase domain, and always provides a full data rate under an arbitrary relative orientation between the two ends. When combined, the two advanced modulation schemes realize 32x and 128x data rate gains over the OOK baseline in experimental and emulation results, respectively.
This work breaks through the fundamental challenge (i.e., the low-throughput bottleneck of VLID tags) that originates in the physical properties of the device, and it won First Place in the MobiCom’19 Student Research Competition. See our [SIGCOMM’20] paper for technical details.
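The timing idea behind DSM can be illustrated with a deliberately simplified sketch. It assumes ideal rectangular pulses, binary symbols, and a noise-free receiver that sees the sum of all pixel states; real LCM pulses are nonlinear and asymmetric, so the actual modulation and demodulation in the paper differ. The function names are hypothetical. Each of `num_pixels` pixels holds its symbol for `num_pixels` sub-slots, but the pixels start one sub-slot apart, so a new symbol begins in every sub-slot.

```python
def dsm_superimpose(symbols, num_pixels):
    """Received level per sub-slot: the sum of the last `num_pixels` symbols,
    because each pixel holds its symbol for `num_pixels` staggered sub-slots."""
    samples = []
    for n in range(len(symbols)):
        lo = max(0, n - num_pixels + 1)
        samples.append(sum(symbols[lo:n + 1]))
    return samples


def dsm_decode(samples, num_pixels):
    """Decision-feedback decoding of the resulting inter-symbol interference:
    subtract the contribution of the symbols that are still being held."""
    decoded = []
    for n, level in enumerate(samples):
        lo = max(0, n - num_pixels + 1)
        decoded.append(level - sum(decoded[lo:n]))
    return decoded
```

With 4 pixels, the symbols `[1, 0, 1, 1, 0, 0, 1, 0]` produce the levels `[1, 1, 2, 3, 2, 2, 2, 1]`, and decoding returns the original symbols. In this idealized model the symbol rate grows linearly with the number of pixels, which matches the approximately linear gain described above.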
State-of-the-art practical VLID systems cannot meet the low-latency requirement (sub-second level), especially in typical IoT scenarios with sparse traffic and massive connectivity. We identify the key to solving the problem as a systematic low-latency design that minimizes the networking latency originating in both the PHY and MAC layers. At the PHY layer, we enable and exploit physical-layer concurrency to minimize the time a user spends waiting for other users to exit the channel. The opportunity lies in the seemingly “inaccessible” imperfections of hardware manufacturing and physical properties, which can be investigated and leveraged to realize system-level performance gains. We discover an unexplored pulse diversity in the temporal sequences sent by the LCM, which is specific to the tag context, such as its placement and manufacturing imperfections. We design a low-latency learning-based demodulation algorithm that handles the rank-deficient channel in pulse-division concurrent transmission to achieve a low bit error rate and real-time demodulation. At the MAC layer, we design a centralized MAC protocol that coordinates the concurrency-based contentions and efficiently turns PHY-layer concurrency into low-latency multiple access. Our prototyped system, RetroMUMIMO, supports up to 8 concurrent VLID uplinks and reduces latency by up to 92.0% compared with state-of-the-art VLID systems.
This work identifies and addresses the latency issue of IoT networks with massive connectivity originating in the lack of concurrency in the PHY layer. See our [SenSys'22] paper for technical details.
The bio-inspired spike camera breaks the conventional frame-based representation of videos by mimicking the sampling mechanism of the primate fovea. It has several merits compared with conventional cameras, such as high temporal resolution (up to 40000 Hz) with common CMOS technologies, and free dynamic range. These merits enable, for the first time, practical reception of VLC signals that is both high-speed (kHz-level, compared with tens of Hz for conventional cameras) and visual (overcoming the severe optical noise that photodiodes suffer in outdoor mobile scenarios). We thus select the bio-inspired spike camera as the receiver of VLID uplinks to enhance their performance in the V2X networking scenario. On-road retroreflective objects equipped with VLID tags can either be set up to broadcast information repeatedly or sense the nearby environment and send messages accordingly. The messages sent by the tags are captured by the spike camera mounted on passing vehicles, which may simultaneously use the spike camera for visual perception (for autonomous driving or advanced driving assistance). We design a series of algorithms to process the output of the spike camera and demodulate the information sent by the tags, including an image reconstruction algorithm that limits the look-ahead depth of the spike streams for efficiency, an adaptive quantization algorithm for contrast enhancement, a demodulation algorithm that handles the asymmetric response of the LCM, and a tracking algorithm that follows the motion of tags by estimating the signal quality for each pixel. In-lab experimental results demonstrate that the system achieves a near-zero bit error rate when the tags transmit at 4.8 Kbps, and that the links are robust under various distances and mobile scenarios.
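As a rough illustration of how a spike stream can carry a modulated light signal, the sketch below models a single pixel with the commonly described integrate-and-fire mechanism and recovers on-off symbols by counting spikes per symbol period. This is a toy model under stated assumptions (one pixel, noise-free intensities, known symbol boundaries, hypothetical function names and threshold values); it does not reproduce the reconstruction, quantization, demodulation, or tracking algorithms of the actual system.

```python
def spike_stream(intensity, threshold=1.0):
    """Integrate-and-fire pixel: accumulate incoming light and emit a spike
    (then subtract the threshold) whenever the accumulator reaches it."""
    acc = 0.0
    spikes = []
    for value in intensity:
        acc += value
        if acc >= threshold:
            spikes.append(1)
            acc -= threshold
        else:
            spikes.append(0)
    return spikes


def demodulate_ook(spikes, samples_per_symbol, count_threshold):
    """Count spikes in each symbol period; a high count is read as '1'."""
    bits = []
    for start in range(0, len(spikes), samples_per_symbol):
        count = sum(spikes[start:start + samples_per_symbol])
        bits.append(1 if count > count_threshold else 0)
    return bits


# Hypothetical example: bright = 0.5 and dark = 0.125 units of light per sample.
tx_bits = [1, 0, 1, 1]
light = [0.5 if b else 0.125 for b in tx_bits for _ in range(8)]
rx_bits = demodulate_ook(spike_stream(light), samples_per_symbol=8, count_threshold=2)
```

In this example a bright symbol produces 4 spikes and a dark symbol produces 1, so `rx_bits` equals `tx_bits`. Brighter light yields denser spikes, which is why the spike rate can serve as an intensity estimate.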
This work demonstrates the feasibility of enhancing vehicular VLID/VLC networking with the novel spike cameras, and provides a promising approach towards integrated sensing and communication with visible light. See our [HotMobile'23] paper for technical details and more research opportunities.
Efficient and seamless interaction in Mobile Augmented Reality (MAR) relies on carefully designed user interfaces that relieve users of heavy mental stress and burden. Current wireless technologies for directional interaction, such as WiFi or BLE, cannot use the user's spatial context (i.e., location and orientation) to facilitate target selection, which brings unnecessary interaction effort (e.g., additional vision-based registration). The mainstream vision-based methods built on top of them also suffer from reduced reliability in noisy environments, non-scalable wireless connections, and fixed form factors with limited data capacity, which fundamentally limits the efficiency of real-world applications where frequent and seamless target selection is necessary. We developed RetroFlexMAR, an optical-sensing solution that leverages visible light backscatter communication for directional interaction with intelligent objects. It instruments objects with customized retroreflective markers called Vitags, which can be implemented in a flexible form factor and communicate with the smartphone camera by backscattering the flashlight beam. RetroFlexMAR exploits the intrinsic spatial context of the user to keep the interaction process intuitive. Our prototype system shows that RetroFlexMAR works reliably at a distance of up to 4 meters and a view angle of up to 100 degrees, and achieves 6-DoF 3D tracking with an error as low as 1 cm in translation and 4.7 degrees in rotation. User studies show that RetroFlexMAR improves the interaction time of MAR contactless control by at least 2x compared with WiFi-based solutions.
This work demonstrates the merits of leveraging the physical characteristics of visible light to exploit the user's spatial context to facilitate HCI tasks. See our [TPCI'22] paper for technical details and more research opportunities.
Radio frequency identification (RFID) is a widely adopted technology in various domains such as supply chain management, inventory management, and access tracking, with a market value exceeding $10 billion in 2020. Despite its widespread usage, RFID systems are susceptible to a practical reliability issue known as cross-reading. Cross-reading occurs when unintended tags located outside the intended range are erroneously read. To address this challenge, our research group has developed two location-sensitive RFID systems.
The first system, NFC+ [SIGCOMM'20], leverages the distinctive power boundary exhibited by the magnetic field to enable boundary-sensitive tag reading. However, the reading speed of NFC+ is constrained by the limitations of the NFC stack. Therefore, we have deployed a second system called RF-Chord [NSDI'23]. RF-Chord utilizes parallel wideband localization techniques for UHF RFID tags, offering both location-awareness and high-throughput identification reading capabilities.
RFID (as a specific radio backscatter) technology has been widely regarded as a key enabler for smart inventory tracking since its invention. When applied in logistics networks, traditional UHF RFID systems suffer from the problems of miss-reading misaligned tags and cross-reading undesired tags due to complicated indoor multipath issues. As an alternative solution for reliable inventory tracking, we have reconstructed near-field communication (NFC) using magnetic fields. The magnetic signal decays faster with distance than UHF electromagnetic signal, and rarely experiences multipath reflection. We can leverage this property to prevent cross-reading undesired tags beyond the operating range, without the need for extra inventories for tag location estimation. Challenges in existing magnetic RFID systems include short working distances (less than 10 cm) and specific tag orientation. We overcome these challenges with physical and algorithmic techniques. To extend the working range, we act in opposition to the conventional practice and use high-quality factor coils. The consequent symbol distortion is avoided by separated TX and RX coils and a passive self-interference cancellation mechanism. For tags with various orientations, we learn from multi-antenna diversity and leverage multiple TX coils to broaden the angular coverage. Another technique we have introduced to make the system practical in real-world scenarios is the use of passive magnetic repeaters, which consist of only one-turn coils and can spontaneously repeat the reader's action without the need for a battery, helping both the TX and RX. Our microbenchmark experiments and large-scale warehouse tests show that we can achieve a 3m working distance, reduce the misreading rate from 23% to 0.03% and cross-reading rate from 42% to 0.
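The sharper boundary of a magnetic link can be seen from a back-of-the-envelope comparison. The sketch below assumes idealized textbook exponents: received power falling off as d^-6 for near-field magnetic (dipole) coupling, whose field strength decays as d^-3, and as d^-2 for one-way far-field free-space propagation. Real deployments deviate from both, and the function name is hypothetical.

```python
import math


def extra_loss_db(distance_ratio, power_exponent):
    """Additional path loss in dB when the distance grows by `distance_ratio`,
    for received power proportional to distance ** (-power_exponent)."""
    return 10.0 * power_exponent * math.log10(distance_ratio)


# Idealized exponents (assumptions): 6 for near-field magnetic coupling,
# 2 for one-way far-field radiation in free space.
near_field_db = extra_loss_db(2.0, 6)
far_field_db = extra_loss_db(2.0, 2)
```

Under these assumptions, doubling the distance costs roughly 18 dB on the magnetic link but only about 6 dB on the far-field link, so a tag slightly beyond the intended range is far less likely to be powered and read.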
This work presents the first system that can do RFID tag inventory with sufficient accuracy and high reliability, which leverages magnetic field to ensure that it can read 99.9% tags within ROI and demonstrates high robustness for RFID unfriendly media (e.g., water bottles and metal cans). See our [SIGCOMM'20b] paper for technical details.
Attaching RFID tags to products or packaging allows real-time visibility and better inventory management, enabling faster and more accurate picking and packing processes. However, cross-reading has a significant impact on reliability. Current location-based approaches, such as fingerprinting with reference tags or synthetic aperture-based localization, can reduce cross-reading, but at the cost of throughput. We introduce a new RFID reader platform, RF-Chord, to support high-accuracy and high-throughput localization without modifying commercial tags. Our research has shown that commercial RFID tags can reflect radio waves from 700 MHz to 1100 MHz, even though they are originally designed for the narrow UHF ISM band (e.g., 902-928 MHz). The RF-Chord platform works with an ISM-band reader and receives wideband tag responses (200 MHz) from multiple antennas for high time and spatial resolution. We invested substantial effort in hardware and software to achieve full parallelism for high throughput. One RF-Chord board reads 180 tags per second while performing channel estimation across the 200 MHz band from 4 antennas. Such parallelism produces a huge amount of data (62 Gbps in our system), so we develop a hardware-software co-designed architecture that combines baseband IC, high-speed PCB, FPGA, CPU, GPU, and a series of hierarchical algorithms to enable real-time analog-to-digital conversion, channelization, data aggregation, demodulation, channel estimation, and localization. Our pilot experiments and theoretical analysis have shown that the multipath effect is the primary source of long-tail localization errors indoors. To address it, we propose a kernel-layer framework based on the hologram algorithm to process the channel response and design a multipath suppression algorithm tailored to the characteristics of the logistics scenario.
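To show what a hologram-style estimator does with a wideband channel response, here is a minimal one-dimensional sketch. It assumes a single antenna, a single line-of-sight path, no noise, and a round-trip backscatter phase of -4*pi*f*d/c. The frequency grid and all names are hypothetical, and the kernel-layer framework and multipath suppression of RF-Chord are not reproduced here.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def backscatter_channel(freqs_hz, distance_m):
    """Ideal round-trip channel response of one tag at `distance_m`."""
    return [cmath.exp(-1j * 4 * math.pi * f * distance_m / C) for f in freqs_hz]


def hologram_range(freqs_hz, channel, candidates_m):
    """Score each candidate distance by coherently summing the measured
    channel after undoing the phase that distance would have produced."""
    best_d, best_score = None, -1.0
    for d in candidates_m:
        total = sum(h * cmath.exp(1j * 4 * math.pi * f * d / C)
                    for f, h in zip(freqs_hz, channel))
        if abs(total) > best_score:
            best_d, best_score = d, abs(total)
    return best_d


# Assumed setup: 41 tones spanning 200 MHz (800 to 1000 MHz), 5 cm search grid.
freqs = [800e6 + k * 5e6 for k in range(41)]
grid = [i * 0.05 for i in range(201)]  # 0 m to 10 m
estimate = hologram_range(freqs, backscatter_channel(freqs, 3.2), grid)
```

In this noise-free setting the score peaks at the true distance because all tones add in phase only there. A wider band narrows the peak, which is the intuition behind using 200 MHz instead of the narrow ISM band.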
RF-Chord enables an accuracy of 0.7 m under 99% of conditions and reduces the cross-reading rate from 2% to 0.002% in practical deployments at Alibaba's logistics business.
This work presents highly reliable package check-in/out in a logistics network, with a 10x reliability improvement and a 100x throughput improvement compared with state-of-the-art RFID localization systems. See our [NSDI'23b] paper and project page for technical details and open source resources.
Extreme mobility has become a norm rather than an exception in our daily life, ranging from today's highway and high-speed railways to tomorrow's drone swarm and orbiting satellites in the context of integrated space and terrestrial networks. However, today's mobile Internet is not yet ready for delivering seamless quality of experience (QoE), primarily limited by its local viewpoint of end-to-end network topology, reactive behavior to unexpected network fluctuation and deterministic modeling encountering stale network measurement, which is no longer fit for the extreme mobility scenarios. To address these challenges, we leverage the key insight that extreme mobility often comes with movement trajectory pre-planning and networking performance predictability in a sense, follow the design principles of cross-layer and event-driven proactive scheduling, and build practical networking (sub)system middleware covering congestion control, flow control, and multipath scheduling elements, towards improved reliability and performance.
Recently, the rapid development of high-speed rails (HSRs) has dramatically changed the way people commute for medium-to-long-distance travel. For instance, a train traveling above 300 km/h potentially provides a more efficient way of door-to-door transportation than an airplane. While such high mobility brings great transportation efficiency, it also poses unprecedented challenges, from the bottom up, in delivering seamless Internet service to onboard passengers over trackside broadband radio (e.g., LTE/5G) connectivity: from error-prone L1/L2 connectivity to misguided TCP. Specifically, increasing mobility poses several new challenges. It degrades the link quality: as the Doppler spread increases, the BER rises and the PHY data rate drops, which throttles TCP throughput. From the handover perspective, handovers not only become more frequent but are also more likely to fail, because of unreliable handover control signal transmission and the tighter timing budget for handover completion under the train’s extremely high mobility. To address these challenges, we start by performing a systematic TCP-LTE/5G cross-layer measurement, design a data-driven multipath transmission proxy solution based on the insights from the measurement study, and finally explore unique opportunities for mobile-edge collaborative content mobility management in the future Internet architecture.
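The scale of the Doppler effect at these speeds can be estimated with the standard maximum Doppler shift formula f_d = v * f_c / c. The sketch below is only an order-of-magnitude illustration: the 2.6 GHz carrier is an assumed example value rather than a measured one, the function names are hypothetical, and taking the coherence time as 1/f_d is a coarse rule of thumb.

```python
C = 299_792_458.0  # speed of light in m/s


def max_doppler_hz(speed_kmh, carrier_hz):
    """Maximum Doppler shift f_d = v * f_c / c for a terminal moving at speed v."""
    return (speed_kmh / 3.6) * carrier_hz / C


def coherence_time_ms(speed_kmh, carrier_hz):
    """Rule-of-thumb channel coherence time, taken here as 1 / f_d."""
    return 1e3 / max_doppler_hz(speed_kmh, carrier_hz)


# Assumed carrier of 2.6 GHz; 350 km/h train versus a 60 km/h car.
train_fd = max_doppler_hz(350, 2.6e9)
car_fd = max_doppler_hz(60, 2.6e9)
train_tc = coherence_time_ms(350, 2.6e9)
```

Under these assumptions the train sees a Doppler shift above 800 Hz and a coherence time on the order of a millisecond, several times worse than the car, which gives a sense of why channel estimates go stale quickly on HSR.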
We conducted the first large-scale active-passive measurement study of TCP performance over LTE on HSR and performed an in-depth cross-layer analysis to reveal the performance in commercial HSR networks as well as identify several performance inefficiencies. Based on the 1732.9 GB of data collected over 135,719 km of trips, we found that the extreme mobility of HSR not only effectively degrades the performance of two representative TCP variants - BBR and CUBIC, across all metrics including throughput, RTT, loss rate, and bytes-in-flight, but also incurs more frequent handover and link disconnections. To better understand the impact of LTE disconnection on the upper layer data plane transmission stall, we designed a disconnection-centric TCP stall diagnosis tool MobiStallDiag. The tool first synchronizes the LTE disconnection and TCP stall event traces from different system timestamps based on packet events, then examines the relationship between LTE disconnection and TCP stall in terms of cooccurrence and duration, and further conducts a comparative cause analysis based on whether the stall is associated with disconnection or not. Our disconnection-centric stall analysis shows that: 1) successful handover has less than half in both probabilities of causing a stall and caused stall duration in comparison to the other three types of long disconnection; 2) stall duration is at least hundreds of milliseconds longer than the disconnection itself; 3) successful handover is more likely to create a delay or lost ACK, while the other types of long disconnection tend to cause more packet loss and out-of-order delivery events. Furthermore, we expanded the scope of our study to include both LTE and 5G in a large-scale measurement campaign on a high-speed railway route operating at 350 km/h. 
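The co-occurrence step of such a diagnosis can be sketched as interval matching. The code below is not MobiStallDiag: it assumes both traces have already been synchronized to a common clock and reduced to (start_ms, end_ms) intervals, and all names are hypothetical. For each TCP stall it finds an overlapping disconnection, if any, and reports how much longer the stall lasted than the disconnection.

```python
def overlaps(a, b):
    """True if the half-open intervals a and b (start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def match_stalls(stalls, disconnections):
    """Associate each stall with the first overlapping disconnection."""
    report = []
    for stall in stalls:
        cause = next((d for d in disconnections if overlaps(stall, d)), None)
        extra_ms = None
        if cause is not None:
            extra_ms = (stall[1] - stall[0]) - (cause[1] - cause[0])
        report.append({"stall": stall, "disconnection": cause, "extra_ms": extra_ms})
    return report


# Hypothetical traces, in milliseconds on a shared clock.
stalls = [(10_000, 11_500), (30_000, 30_400)]
disconnections = [(10_100, 10_600), (50_000, 51_000)]
report = match_stalls(stalls, disconnections)
```

In this made-up example the first stall is attributed to a 500 ms disconnection and outlasts it by 1000 ms, while the second stall has no associated disconnection and would go to the comparative cause analysis instead.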
Besides revealing the aforementioned key characteristics of 5G and LTE in extreme mobility, we developed a taxonomy of handovers in both technologies and conducted a link-layer latency breakdown analysis. Our findings uncovered various deficiencies in user equipment and radio access network that impeded seamless connectivity and hindered the optimal utilization of 5G's high bandwidth: 1) commercial 5G struggled to achieve its sub-millisecond latency goal due to untimely resource allocation in base stations; 2) prolonged handover duration caused by longer processing delays in user equipment contributed to a larger throughput drop at handovers for 5G compared to LTE.
This work emphasized the importance of developing dedicated protocol mechanisms that can adapt to extreme mobility and highlighted the necessary steps toward the evolution of cellular networks. This work has been published in [MobiCom’19], [JSAC’20], and [SIGMETRICS'22].
Our cross-layer network performance measurements in high-mobility environments reveal several key insights, including predictable handover (failure) patterns, highly variable network performance, carrier connectivity diversity, and occasionally excessive timeouts. To turn these observations into performance improvement opportunities, we take advantage of the heterogeneous network paths from multiple mobile carriers and tackle the problem at both the algorithm and system levels. Our key insight is that optimizing transport performance by modeling the network more accurately works poorly in the highly fluctuating HSR network, due to the complexity of the network conditions and large errors in RTT/throughput measurements. In this work, we instead take a data-driven approach and identify four events (1 in the link layer, 3 in the transport layer) such that, once one occurs, we can confidently tell that a specific optimization would be beneficial. Based on this idea, we develop our composable (modularized) multipath scheduling framework. The framework allows event-triggered schedulerlets (scheduling modules) to shape the scheduler's behavior by manipulating its input and output sets. We also build four schedulerlets in response to the four identified events and integrate them with the MPTCP default scheduler, resulting in our multipath scheduler dedicated to the HSR network. From a scheduling algorithm perspective, our design includes: 1) location-aware network disconnection prediction and TCP stall avoidance, 2) active network metric probing on idle network paths for agile discovery and utilization of available paths, 3) tail-aware opportunistic path switching to accelerate the transmission of the session tail, and 4) aggressive cross-flow retransmission upon detecting multiple timeout events on packets, which avoids the extremely high delivery time caused by path failures.
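The composition idea can be sketched in a few lines. The code below is an illustrative assumption of how event-triggered modules might wrap a base scheduler by editing its input set (candidate paths) and output set (chosen paths). The class names, event names, and the lowest-RTT base policy are hypothetical stand-ins, not the interfaces of the deployed system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    srtt_ms: float


def base_scheduler(paths):
    """Stand-in for a default scheduler: pick the lowest-RTT path."""
    return [min(paths, key=lambda p: p.srtt_ms)] if paths else []


class Schedulerlet:
    event = ""  # name of the event that triggers this module

    def filter_input(self, paths, ctx):
        return paths

    def adjust_output(self, chosen, paths, ctx):
        return chosen


class AvoidPredictedDisconnection(Schedulerlet):
    event = "disconnection_predicted"

    def filter_input(self, paths, ctx):
        # Drop paths expected to disconnect soon; keep all if none would remain.
        kept = [p for p in paths if p.name not in ctx.get("at_risk", set())]
        return kept or paths


class CrossFlowRetransmission(Schedulerlet):
    event = "multiple_timeouts"

    def adjust_output(self, chosen, paths, ctx):
        # Send redundantly on every remaining candidate path.
        return list(paths)


def schedule(paths, active_events, schedulerlets, ctx=None):
    ctx = ctx or {}
    triggered = [s for s in schedulerlets if s.event in active_events]
    candidates = list(paths)
    for s in triggered:
        candidates = s.filter_input(candidates, ctx)
    chosen = base_scheduler(candidates)
    for s in triggered:
        chosen = s.adjust_output(chosen, candidates, ctx)
    return chosen
```

With two paths A (20 ms) and B (50 ms), no active events yields A; a predicted disconnection on A yields B; and a multiple-timeout event yields both paths. The base scheduler is left untouched in every case, which is the point of making the modules composable.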
From the system design perspective, we implement the system entirely in the userspace as a middleware that: 1) reuses existing TCP sockets for performance guarantee, and 2) is flexible to integrate cross-layer context information including location, cellular event, and flow statistics. Experimental results from WiFi system deployment on the Beijing-Shanghai line show that our full-fledged Polycorn system outperforms MPTCP by 57% in goodput for single session bulk download, and 45% in instant messaging delivery time and 49% in coefficient of variance for performance and fairness enhancement for up to 30 users.
This work pioneers the concepts of the schedulerlet and the composable multipath scheduling framework. It also presents the first practical and deployable multipath transmission system dedicated to boosting mobile Internet performance for all HSR passengers, and it has been actively deployed and evaluated on the onboard LTE gateway (CPE) of Beijing-Shanghai HSR trains. See our [NSDI'23a] paper for technical details.
Vehicle-to-cloud video streaming is a core functionality that enables a broad array of new applications, ranging from in-vehicle entertainment streaming and gaming to even more challenging mission-critical tasks. The fundamental challenge in realizing such applications is how to continuously support high-bitrate, low-latency video streaming over highly fluctuating cellular links as a vehicle drives. First, the packet delivery delay of traditional ARQ-based transport protocols has a long-tailed pattern because of repeated packet losses. Second, the bandwidth-hungry video streaming application strains the available bandwidth of the cellular interfaces, which effectively forbids extensive use of redundant traffic. Finally, “dead spots” of cellular networks are common and inevitable, but the video streaming application must work properly in such cases. We address these issues with our connectivity-as-a-service solution, CellFusion. CellFusion synergizes two fields, multipath transport and network coding. First, CellFusion builds a novel hardware-software system that aggregates multiple heterogeneous (multiple carriers, both 4G and 5G) cellular network resources to survive “dead spots”. Second, CellFusion introduces XNC, our network coding-based multipath transport on top of QUIC. XNC leverages QUIC-Datagram as an unreliable medium and integrates multipath features. It reuses as many of the existing QUIC transport features (e.g., congestion control, encryption, traversal of middleboxes) as possible. On top of this base layer, XNC introduces a partially reliable transport mechanism with QUIC-based random linear network coding, QoE-aware loss detection, and opportunistic one-shot recovery.
In a nutshell, XNC quickly detects video loss based on a QoE-aware policy and maximizes the recovery probability with opportunistic one-shot recovery: it retransmits enough random linear combinations (equations) of the lost packets to opportunistically use the instantaneous spare capacity of all paths. Experimental results on 100 vehicles over 6 months show that XNC reduced video packet delay by 71.53% at the 99th percentile versus 5G. At 30 Mbps, CellFusion achieved a 66.11% ∼ 80.62% reduction in video stall ratio versus state-of-the-art multipath transport solutions with less than 10% traffic redundancy.
This work presents the first connectivity-as-a-service solution that enables in-the-wild, high-quality real-time video streaming from a vehicle to the cloud. It also evaluates the idea of combining network coding with transport protocols in a production system for the first time. CellFusion is currently used by Alibaba in its IoT wireless connection solution (XLINK). See our [SIGCOMM’23] paper for technical details.
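The "random linear combinations (equations)" idea can be illustrated with a small random linear network coding sketch. For brevity it assumes coding over GF(2) (plain XOR) with packets represented as integers. The field, packet format, and function names are illustrative assumptions and not the XNC wire format. The receiver can recover the k lost packets from any k linearly independent coded packets, regardless of which path delivered them.

```python
import random


def encode(packets, rng):
    """Return (mask, payload): the XOR of a random non-empty subset of packets.
    Bit i of `mask` records whether packet i is part of the combination."""
    k = len(packets)
    mask = rng.randrange(1, 1 << k)
    payload = 0
    for i in range(k):
        if (mask >> i) & 1:
            payload ^= packets[i]
    return mask, payload


def decode(coded, k):
    """Gauss-Jordan elimination over GF(2). Returns the k original packets,
    or None while the received equations do not yet have full rank."""
    pivots = {}  # pivot bit -> (mask, payload), kept in reduced form
    for mask, payload in coded:
        for bit, (p_mask, p_payload) in pivots.items():
            if (mask >> bit) & 1:
                mask ^= p_mask
                payload ^= p_payload
        if mask == 0:
            continue  # linearly dependent equation, carries no new information
        bit = mask.bit_length() - 1
        for other, (p_mask, p_payload) in list(pivots.items()):
            if (p_mask >> bit) & 1:
                pivots[other] = (p_mask ^ mask, p_payload ^ payload)
        pivots[bit] = (mask, payload)
    if len(pivots) < k:
        return None
    return [pivots[i][1] for i in range(k)]
```

A sender using this sketch would keep emitting fresh combinations of the lost packets on whichever paths have spare capacity until the receiver reports that `decode` succeeded, so no individual coded packet is critical.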
Over the years, we have seen several examples of mobile multipath deployments by operators and mobile application vendors. Despite the availability of multipath solutions, there are few news articles documenting deployment experiences, and even fewer discussing the costs and challenges associated with large-scale commercial deployments. To fill this gap, we collaborated with a major phone vendor to understand the commercial deployment costs of mobile multipath transport. Previous research and deployments demonstrated the superior performance of MPTCP. As a result, we partnered with a major online video platform in China to enable MPTCP for video traffic in December 2018. We found that deploying MPTCP was more challenging than expected, mainly due to business considerations rather than technical difficulties. In this paper, we share our experiences during the development, deployment, and operation phases. After one year of operation, our deployment was retired in December 2019 because we could not identify a viable business model to generate revenue. We reviewed our experience with MPTCP and identified the key challenge as the need for an end-to-end solution that coordinates multiple network parties to jointly support MPTCP. This is challenging because the various parties have different and non-overlapping business incentives. This circumstance led us to explore a new direction for mobile multipath transport deployment. We observed that HTTP is the primary protocol used for mobile application traffic in the network we studied in China. HTTP is highly compatible with the existing infrastructure, such as CDNs and web caches. Therefore, we shifted our approach from MPTCP to Multipath HTTP (MPHTTP). In our work, we introduce a mobile system service called Fleety, which enables transparent multipath transport for all other Internet parties. We launched Fleety in September 2019. 
Currently, Fleety supports 142 device models, including smartphones and tablets, and 156 popular applications, including social media, video, games, news, and cloud storage services in China. We believe our design can lower the deployment barrier and increase the adoption rate of multipath transport.
This work shares our experience in deploying MPTCP and MPHTTP to nearly 10 million mobile devices. It presents the first mobile system service that enables multipath transport without the involvement of other network parties, including the application, the server, and the middlebox. We showed that MPHTTP can lower the deployment barrier and immediately benefit applications at scale. See our [MobiCom'23a] paper for details.
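As a rough illustration of why HTTP lends itself to multipath transport, the sketch below splits one download into byte ranges that could be fetched over different network interfaces with standard HTTP `Range` requests. This is an assumption made for illustration, not a description of how Fleety works: the function names, the path names, and the rate-proportional split are all hypothetical.

```python
def split_ranges(total_bytes, path_rates):
    """Split [0, total_bytes) into one contiguous byte range per path.

    path_rates maps a (hypothetical) path name to an estimated rate; each
    path gets a share of the object proportional to its rate.
    """
    if total_bytes <= 0 or any(r <= 0 for r in path_rates.values()):
        raise ValueError("total_bytes and all rates must be positive")
    total_rate = sum(path_rates.values())
    names = list(path_rates)
    ranges, start = {}, 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = total_bytes  # last path absorbs any rounding remainder
        else:
            end = start + int(total_bytes * path_rates[name] / total_rate)
        ranges[name] = (start, end - 1)  # inclusive bounds, as in HTTP Range
        start = end
    return ranges


def range_header(byte_range):
    """Format an inclusive byte range as an HTTP Range header value."""
    first, last = byte_range
    return f"bytes={first}-{last}"


# Example: a 1 MB object over Wi-Fi (faster) and cellular (slower).
plan = split_ranges(1_000_000, {"wifi": 30, "cellular": 10})
for path, byte_range in plan.items():
    print(path, range_header(byte_range))
```

Because each sub-request is an ordinary HTTP request, servers, CDNs, and web caches that already support range requests need no changes, which is the compatibility property the paragraph above relies on.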
Due to the limited transmission power budget and the often adverse signal fading conditions in embedded and implanted environments, IoT uplinks are highly susceptible to cross-technology interference (CTI). Previous approaches to mitigating CTI, which rely on MAC/PHY designs, lack generality and perform poorly in the presence of wideband CTI devices such as Wi-Fi networks and RF jammers, which feature high transmission power, aggressive spectrum use, and extensive channel utilization. We introduce RF-SIFTER, a technology-agnostic approach to mitigating wideband CTI for IoT uplinks. RF-SIFTER leverages the bandwidth gap between narrowband IoT signals and wideband CTI to perform cross-technology beamforming, enabling significant suppression of wideband CTI without relying on demodulation or prior knowledge of the IoT and CTI signals. RF-SIFTER is designed as a layer-0.5 that is decoupled from the wireless architecture, allowing transparent integration into gateways of existing and future IoT technologies without modifying the MAC/PHY. Extensive experiments conducted on an FPGA-based software radio prototype at two typical IoT carrier frequencies (915 MHz and 2.4 GHz) show that RF-SIFTER can improve the SINR of IoT uplink signals by up to 29 dB and enhance the packet delivery ratio by 2× to 5× for popular IoT technologies such as ZigBee, BLE, and RFID, even under challenging conditions such as the high-power wideband CTI of 802.11ac networks and a commodity RF jammer that employs a proprietary modulation scheme.
This work introduces a technology-agnostic approach to mitigating wideband CTI for IoT uplinks by exploiting the bandwidth gap between IoT and wideband signals. It features a transparent layer-0.5 design that enables RF-SIFTER to be integrated into IoT gateways without modifying the existing PHY/MAC. See our [MobiCom'23b] paper for more details.
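The intuition behind exploiting the bandwidth gap can be sketched with a toy two-antenna model. This is a simplified illustration under our own assumptions, not the RF-SIFTER algorithm: we assume a noise-free receiver, flat channels, and that some samples contain only the wideband interferer (standing in for the spectrum outside the narrow IoT band). All channel values and function names are made up for the example.

```python
import cmath
import random


def estimate_cti_ratio(ant1, ant2):
    """Least-squares estimate of h2/h1 from samples that contain only CTI."""
    num = sum(b * a.conjugate() for a, b in zip(ant1, ant2))
    den = sum(abs(a) ** 2 for a in ant1)
    return num / den


def null_cti(ant1, ant2, ratio):
    """Combine the two antennas so that the CTI component cancels."""
    return [b - ratio * a for a, b in zip(ant1, ant2)]


def gaussian_samples(n, rng):
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]


rng = random.Random(0)
h1, h2 = 1.0 + 0.5j, 0.3 - 0.8j    # CTI channel per antenna (hypothetical)
g1, g2 = 0.2 + 0.1j, -0.4 + 0.3j   # IoT channel per antenna (hypothetical)

# Out-of-band observations: only the wideband interferer is present, so its
# spatial signature can be learned without knowing anything about its waveform.
cti_oob = gaussian_samples(64, rng)
ratio = estimate_cti_ratio([h1 * c for c in cti_oob], [h2 * c for c in cti_oob])

# In-band observations: weak QPSK-like IoT symbols buried under strong CTI.
iot = [cmath.exp(1j * cmath.pi / 2 * rng.randrange(4)) for _ in range(32)]
cti_ib = [10 * c for c in gaussian_samples(32, rng)]
ib1 = [g1 * s + h1 * c for s, c in zip(iot, cti_ib)]
ib2 = [g2 * s + h2 * c for s, c in zip(iot, cti_ib)]

# After nulling, only a scaled copy of the IoT signal should remain.
clean = null_cti(ib1, ib2, ratio)
gain = g2 - ratio * g1
residual = max(abs(z - gain * s) for z, s in zip(clean, iot))
print("max residual after nulling:", residual)
```

The combining weights are derived without demodulating either signal, which mirrors the technology-agnostic property claimed above; a real gateway would additionally have to cope with noise, frequency-selective channels, and time variation.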
Wi-Fi Direct transport provides versatile connectivity that enables convenient data sharing and improves the productivity of mobile end users. It not only improves data transport efficiency by reducing protocol overhead and backhaul traffic but also prevents potential data privacy leakage to the cloud. However, although today’s smartphones are capable of near-Gbps wireless data rates, current transport schemes do not efficiently utilize the available bandwidth in this single-hop environment. Our further investigation reveals three key reasons: 1) the current reliable delivery mechanism uses a per-packet ACKing policy, which exacerbates channel contention on Wi-Fi links; 2) the widely deployed congestion control algorithms are unsuitable for the one-hop, single-flow scenario, as they exhibit an unnecessary startup phase or overreact to the lossy wireless link; 3) the flow control mechanisms in existing transport schemes are inefficient in achieving line-rate transmission over Wi-Fi Direct links, as they are unaware of some on-path buffers. We notice a distinctive characteristic of peer-to-peer direct data transmission that can be exploited to achieve high link utilization. Specifically, we are able to monitor the state of each individual buffer throughout the whole packet life cycle because the two communicating devices are the only entities involved. In this paper, we present SMUFF, a file transfer service that aims to improve Wi-Fi Direct transport throughput to line rate by orchestrating the on-device buffers. We model the transport data path as a series of linearly connected buffers, and consider the traffic as a fluid that sequentially traverses each buffer. Our core idea is to identify and address the bottleneck component within this data path. To maximize throughput, we maintain an appropriate backlog of data in the bottleneck buffer.
Our evaluation against other transport schemes shows that SMUFF achieves up to 88.7% and 91.8% throughput improvement for 802.11ac and 802.11ax, respectively. It improves link utilization by up to 22.6% while reducing CPU usage by 12% and energy consumption by 37%, compared to the state-of-the-art solution.
This work presents the first queue-based solution tailored for Wi-Fi Direct transmission. We present a file transfer service dedicated to Wi-Fi Direct that targets practical line-rate throughput. Our system maximizes throughput by deriving the optimal send rate from the buffer states. See our paper [NSDI'24a] for more details.
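The fluid view of the data path can be illustrated with a minimal sketch. It is not SMUFF's controller: the stage names, the drain rates, and the one-step backlog correction are assumptions chosen for clarity, and we assume every stage upstream of the bottleneck is fast enough that the sender's rate is the bottleneck buffer's input rate.

```python
def find_bottleneck(drain_rates):
    """Return (stage, rate) for the slowest stage in a linear chain of buffers."""
    stage = min(drain_rates, key=drain_rates.get)
    return stage, drain_rates[stage]


def next_send_rate(bottleneck_rate, backlog, target_backlog, interval):
    """Send at the bottleneck drain rate, plus a correction that moves the
    bottleneck backlog to its target within one control interval."""
    correction = (target_backlog - backlog) / interval
    return max(0.0, bottleneck_rate + correction)


# Hypothetical drain rates (MB/s) of the buffers a packet traverses.
drain_rates = {"socket": 140.0, "driver": 120.0, "nic": 95.0}
stage, rate = find_bottleneck(drain_rates)

backlog = 0.0     # MB currently queued in the bottleneck buffer
target = 0.5      # MB of backlog that keeps the bottleneck busy
interval = 0.01   # control interval in seconds

for step in range(5):
    send_rate = next_send_rate(rate, backlog, target, interval)
    # Fluid model: backlog grows by whatever arrives faster than it drains.
    backlog = max(0.0, backlog + (send_rate - rate) * interval)
    print(step, stage, round(send_rate, 3), round(backlog, 3))
```

In this toy model the sender briefly overshoots the bottleneck rate to build the target backlog and then settles at the bottleneck's drain rate, so the slowest stage never idles and no buffer grows without bound.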
Real-time communication (RTC) applications like video conferencing or cloud gaming require consistent low latency to provide a seamless interactive experience to users. However, it is non-trivial to provide a network service with consistently high bitrate and low latency. We partnered with a major cloud gaming service provider in China and conducted a large-scale measurement of network performance for RTC applications. Our measurement revealed that in Wi-Fi networks, the delay of the wireless hop can inflate due to its fluctuating nature, making it difficult to achieve consistently low tail latency. On the other hand, while cellular paths can be leveraged to alleviate the impact of wireless fluctuation on Wi-Fi paths, our user study revealed that it is crucial to constrain cellular data usage while using multipath transport. Furthermore, we found that while network characteristics vary greatly across users, they appear to remain stable for individual users over a period of time. Therefore, to address the challenge of reducing long-tail latency by utilizing cellular paths while minimizing cellular data usage, we design a multipath transport system called Augur, tailored for mobile RTC applications. Augur captures user characteristics by deriving state probability models and formulates the equilibrium as Integer Linear Programming (ILP) problems for each user session to determine the opportunity of frame retransmission and path selection. Augur was deployed in the server cluster of a cloud gaming service provider in January 2023 and has served millions of users. It can achieve up to a 66.0% reduction in frame delivery tail latency and a 99.5% reduction in frame stall rate with an 88.1% decrease in cellular data usage compared to other multipath transport schemes.
This work presents the first multipath transport system tailored for mobile RTC applications that achieves high bitrate, consistently low latency, and low cellular data cost. It also offers a new perspective on real-time video delivery with multipath transmission and on the concerns of real-world deployment. See our paper [NSDI'24a] for more details.
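To give a feel for the kind of trade-off such a formulation captures, the toy 0/1 program below decides which frames to duplicate over cellular so that the expected number of late frames is minimized under a cellular data budget. It is a hypothetical stand-in rather than Augur's actual ILP: the per-frame lateness probabilities, the budget, and the assumption that a cellular copy always arrives in time are ours, and the problem is solved by brute force instead of an ILP solver.

```python
from itertools import product


def plan_cellular_duplicates(frames, budget_bytes):
    """Pick frames to duplicate over cellular.

    frames: list of (size_bytes, p_late_on_wifi) tuples.
    Returns (choice, expected_late), where choice[i] == 1 means frame i is
    also sent over cellular. Exhaustive search, so only for tiny inputs.
    """
    best_choice, best_cost = None, float("inf")
    for choice in product((0, 1), repeat=len(frames)):
        used = sum(size for (size, _), x in zip(frames, choice) if x)
        if used > budget_bytes:
            continue  # violates the cellular data budget
        # Frames left on Wi-Fi only contribute their lateness probability.
        cost = sum(p for (_, p), x in zip(frames, choice) if not x)
        if cost < best_cost:
            best_choice, best_cost = choice, cost
    return best_choice, best_cost


# Hypothetical per-frame sizes and Wi-Fi lateness probabilities for one session.
frames = [(1200, 0.05), (800, 0.30), (1500, 0.40), (600, 0.10)]
choice, expected_late = plan_cellular_duplicates(frames, budget_bytes=2400)
print(choice, round(expected_late, 2))
```

Here the budget is spent on the two frames most likely to be late on Wi-Fi, which reflects the goal stated above: cut tail latency where it matters while keeping cellular usage constrained.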
Reasoning about Network Traffic Load Property at Production Scale
In the dynamic landscape of services like cloud computing, search, and video, leading enterprises are increasingly focused on developing and managing sophisticated Wide Area Networks (WANs) to seamlessly interconnect their data centers. A pivotal aspect of this process is verifying that the network traffic load complies with various specifications, such as keeping link utilization below 80% during network changes, which is crucial for WAN reliability and availability. Drawing from our experience with Alibaba's extensive WAN, we have crafted a verification system to model traffic behavior and proactively detect traffic load violations within the WAN. Consultations with network operators highlighted essential requirements for this verification system: R1) It must support diverse traffic behaviors under protocols like BGP, IS-IS, PBR, SR, and static routes, which are crucial for the WAN's routing and traffic engineering. R2) The system should efficiently analyze traffic over a time period, not just at a single point, to accommodate network changes that can span several hours and involve billions of flows. R3) It should facilitate failure-tolerance analysis, allowing operators to conduct real-time what-if scenarios for potential router and link failures. Addressing these needs was challenging, particularly in modeling complex traffic behaviors and maintaining verification efficiency. To overcome these challenges, we developed a three-pronged approach. First, we introduced the Traffic Distribution Graph (TDG), capable of modeling equal-cost multipath (ECMP), packet rewriting, and tunneling required by various routing protocols. Second, we created an algorithm based on TDG for efficient and accurate traffic distribution simulation across billions of flows and extended periods.
Third, we implemented an incremental traffic simulation method, computing an incremental TDG and simulating only the differential traffic distribution, thus avoiding the need for full network traffic distribution simulations. This system has been instrumental in the daily verification of the WAN for over a year, effectively preventing service disruptions due to traffic load violations.
This work is pivotal in highlighting the importance of traffic load verification in global-scale WANs. As the first reported system of its kind for verifying network traffic load in a production WAN, it has been successfully deployed in Alibaba's WAN, preventing losses amounting to tens of millions of dollars. For a comprehensive technical overview, please refer to our [NSDI'24b] paper.
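The core check can be pictured with a minimal sketch that propagates traffic over a forwarding DAG with equal-cost splitting and flags links above a utilization limit. This is only an illustration under simplifying assumptions (one destination, no packet rewriting or tunneling, uniform ECMP weights); the data structures and names are hypothetical and much simpler than the TDG described above.

```python
from collections import defaultdict, deque


def simulate_link_loads(next_hops, demands):
    """Propagate traffic through a forwarding DAG with ECMP splitting.

    next_hops: {node: [successor, ...]} toward a single destination.
    demands:   {ingress node: traffic volume entering at that node}.
    Returns {(u, v): load} for every link that appears in next_hops.
    """
    nodes = set(next_hops) | set(demands)
    indegree = defaultdict(int)
    for succs in next_hops.values():
        for v in succs:
            indegree[v] += 1
            nodes.add(v)

    arriving = defaultdict(float, demands)
    ready = deque(n for n in nodes if indegree[n] == 0)
    loads = {}
    while ready:  # topological order: a node is processed after all its inputs
        u = ready.popleft()
        succs = next_hops.get(u, [])
        for v in succs:
            share = arriving[u] / len(succs)  # ECMP: equal split over next hops
            loads[(u, v)] = share
            arriving[v] += share
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return loads


def violations(loads, capacity, limit=0.8):
    """Links whose utilization exceeds the limit (80% by default)."""
    return sorted(l for l, load in loads.items() if load / capacity[l] > limit)


# Hypothetical topology: A splits equally to B and C, which both forward to D.
next_hops = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
demands = {"A": 100.0, "B": 40.0}
loads = simulate_link_loads(next_hops, demands)
capacity = {link: 100.0 for link in loads}
print(loads)
print(violations(loads, capacity))
```

In this example the link from B to D carries its own ingress traffic plus half of A's, pushing it to 90% utilization, which is the kind of load violation the system is meant to catch before a network change is rolled out.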