Stephen Crouch

Tuning in to Air Traffic Control

Back in high school, I had a history class assignment to speak with a WWII veteran. The gentleman that I met had piloted cargo aircraft from various Pacific Northwest locations to bases on the Aleutian Islands - often at night and always off the coast of Alaska. He recounted a navigational approach that involved holding an altitude and adjusting his heading based on radio directions from ground tracking stations. About as simple as it gets! But needless to say, he put full faith in his radio and the network of ground stations every time he flew.

ATC as a Perception Input

VTI’s vision-based approach surveillance and navigation functionality has the potential to provide pilots with additional layers of situational awareness and safety as a standalone system. Additionally, information extracted from VTI’s vision system can be cross-referenced with GPS and ADS-B signals to reduce false alarms and provide additional context. But what about air traffic control (ATC) audio?

ATC communication is the primary layer of air traffic management and deconfliction. ATC offers a method to align context-dependent features of a vision system with the state of the airspace and routing instructions. With this in mind, we’re making use of modern audio and natural language processing tools on ATC communications to further improve pilot situational awareness and reduce cockpit workload across multiple phases of flight.

ATC communication presents several key challenges, including variable audio quality, specialized lingo, accents, phraseology, and mis-stated instructions. To see how modern tools deal with these challenges, we looked at two cases, a landing clearance and a taxi instruction, and asked: can ATC communications be parsed automatically into a structured understanding of what was said?

"Civilian air traffic controllers, Memphis" by Wikipedia user Zeamays

The process requires several key steps:

  1. Speech to text

  2. Text to semantic representation

  3. Semantic utilization 

In both of our test cases, the problem is simplified with priors such as our aircraft’s tail number, the available runways at an airport, or taxiway names and configuration. This information can be used to constrain the processing outputs to reasonable values and to check for errors.
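As a rough illustration of how such priors could be used to check a parsed instruction, here is a minimal Python sketch. The `Priors` class, the `check_against_priors` function, and the dictionary layout are hypothetical names chosen for this example, not part of VTI's system, and the runway and taxiway values in the usage note are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Priors:
    """What we already know before listening (hypothetical structure)."""
    call_sign: str
    runways: set = field(default_factory=set)
    taxiways: set = field(default_factory=set)


def check_against_priors(parsed, priors):
    """Return a list of problems; an empty list means the parse is plausible."""
    problems = []
    if parsed.get("call_sign", "").lower() != priors.call_sign.lower():
        problems.append("call sign does not match own aircraft")
    for item in parsed.get("instructions", []):
        # A runway or taxiway that does not exist at this airport is most
        # likely a transcription or parsing error.
        if "runway" in item and item["runway"] not in priors.runways:
            problems.append(f"unknown runway {item['runway']}")
        if "taxiway" in item and item["taxiway"] not in priors.taxiways:
            problems.append(f"unknown taxiway {item['taxiway']}")
    return problems
```

With priors of call sign "American 1" and runways {"19", "15"}, a parsed landing clearance for runway "19" returns an empty list, while one for runway "90" is flagged as an unknown runway.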

In the following examples, identifying information from the input recordings has been anonymized.

Case 1: Landing Clearance

In this first example, we parse ATC communicating with several aircraft in series and look for landing clearance specific to callsign “American 1”.


Text from Audio transcript

[Tower ATC] Lemberg, on departure, you'll fly heading 170, climb and maintain 5000. 

[Lemberg] 170 on the heading on departure, Lemberg. 

[Tower ATC] Southwest, Washington Tower, I'll get you out after this next arrival. On departure, you'll fly heading 190, climb and maintain 5000. 

[Southwest] All right, fly heading 190, climb and maintain 5000, on departure, Southwest, and we're standing by. 

[American 1] Tower, Good afternoon, American 1 coming down the river for 19.


[Tower ATC] American 2, best forward speed, contact Potomac Departure, you have a wonderful day. 

[American 2] Best forward, switch to Departure, American 2.


[Tower ATC] American 1, how you doing? Washington Tower, runway 19, cleared to land, traffic departure prior to your arrival. 

[American 1] Cleared to land runway 19, American 1.

Semantic output

call_sign: "American 1"
- type: "landing"  
  runway: "19"

Final output

American 1 received clearance to land on runway 19. Pilot readback correct.
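To make the three steps concrete for this case, the sketch below takes an already transcribed exchange and produces the semantic and final outputs shown above. It is a simplified, rule-based stand-in written for illustration: the regular expressions, the function names, and the `(speaker, text)` transcript layout are all assumptions, and a real system would need to cope with far more varied phraseology.

```python
import re

# Patterns for the two phrasings seen in the example transcript (assumed, not exhaustive).
CLEARANCE = re.compile(r"runway (\d{1,2}[LCR]?),? cleared to land", re.IGNORECASE)
READBACK = re.compile(r"cleared to land,? runway (\d{1,2}[LCR]?)", re.IGNORECASE)


def landing_clearance(transcript, call_sign, controller="Tower ATC"):
    """transcript is a list of (speaker, text) pairs; returns a dict or None."""
    cleared = None
    for speaker, text in transcript:
        addressed_to_us = text.lower().startswith(call_sign.lower() + ",")
        if speaker == controller and addressed_to_us:
            match = CLEARANCE.search(text)
            if match:
                cleared = match.group(1).upper()
        elif speaker == call_sign and cleared is not None:
            match = READBACK.search(text)
            if match:
                return {"call_sign": call_sign, "type": "landing", "runway": cleared,
                        "readback_correct": match.group(1).upper() == cleared}
    if cleared is None:
        return None
    # Clearance heard, but no matching readback found.
    return {"call_sign": call_sign, "type": "landing", "runway": cleared,
            "readback_correct": False}


def summarize(result):
    status = "correct" if result["readback_correct"] else "not confirmed"
    return (f"{result['call_sign']} received clearance to land on runway "
            f"{result['runway']}. Pilot readback {status}.")
```

Run on the last four lines of the transcript above with call sign "American 1", this sketch ignores the exchange with American 2 and reports a landing clearance for runway 19 with a correct readback.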

Case 2: Taxiway Instructions

Here we demonstrate a pilot receiving taxiway instructions.

Text transcript

[Pilot] Ground, hello.  Iceman, spot 28, ready to taxi.

[Ground ATC] Iceman, Tower, good day… Iceman, are you ready for runway 15?

[Pilot] We can take 15.

[Ground ATC]  Iceman, taxi runway 15, Papa, Juliet, November.

[Pilot] Papa, Juliet, November, runway 15.  Iceman 6.

Semantic output

call_sign: "Iceman"
- type: "taxi"
  taxiway: "P"
- type: "taxi"
  taxiway: "J"
- type: "taxi"
  taxiway: "N"

Final output

For taxi clearances, the interpreted taxi route is depicted on an airport diagram, typically displayed on a multi-function display (MFD) or electronic flight bag (EFB), to aid pilots in route execution, particularly at complex airports with multiple parallel and intersecting taxiways and runways.
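The text-to-semantic step for this taxi instruction can be sketched as a lookup from spoken phonetic-alphabet words to taxiway letters. This is an illustrative simplification only: `parse_taxi` and the output layout are hypothetical, and the sketch does not handle numbered taxiways, hold-short instructions, or runway crossings.

```python
import re

# ICAO spelling alphabet; "alpha" and "juliet" are added as common transcript spellings.
ICAO_WORDS = (
    "alfa bravo charlie delta echo foxtrot golf hotel india juliett kilo lima "
    "mike november oscar papa quebec romeo sierra tango uniform victor whiskey "
    "xray yankee zulu"
).split()
PHONETIC = {word: word[0].upper() for word in ICAO_WORDS}
PHONETIC.update({"alpha": "A", "juliet": "J"})


def parse_taxi(text, call_sign):
    """Extract the ordered taxiway letters from one transcribed taxi instruction."""
    # Drop the call sign first so one such as "November 123" is not read as a taxiway.
    body = text.lower().replace(call_sign.lower(), " ").replace("x-ray", "xray")
    words = re.findall(r"[a-z]+", body)
    return {
        "call_sign": call_sign,
        "instructions": [
            {"type": "taxi", "taxiway": PHONETIC[w]} for w in words if w in PHONETIC
        ],
    }
```

Applied to "Iceman, taxi runway 15, Papa, Juliet, November." with call sign "Iceman", this yields taxi entries for P, J, and N in order, matching the semantic output above. Checking those letters against the airport's known taxiway names is where the priors mentioned earlier would come in.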


The examples presented here illustrate both the potential benefits and the challenges of using modern audio-to-text and natural language processing tools in an aviation context. A key requirement on our roadmap is scaled, automated testing against large audio data sets and the identification of metrics that provide insight into the stability and performance of multi-stage processing strategies.
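As one example of what a stage-level metric might look like, word error rate (WER) is a common measure for the speech-to-text stage. The sketch below is an assumption about a reasonable starting point, not a metric we have committed to; it computes WER as the word-level edit distance between a reference transcript and the system's transcript, divided by the number of reference words.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)
```

For instance, transcribing "cleared to land runway 19" as "cleared to land runway 90" is one substitution in five words, a WER of 0.2. That single error changes the runway, which is why a raw WER would likely need to be paired with checks on the semantic output.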

As we continue to develop capabilities in this area, we would appreciate community feedback on potential use cases for automated ATC communications parsing and ADS-B track comparison.


