Malloy Documentation
search

Malloy has a special query operator that builds _Dimensional Search Indexes for sources. A Dimensional Search Index is a table with 4 columns. Dimensional indexes are useful for a variety of things including filtering suggestions and LLMs.

When filtering data, you might know a term, but not necessarily which column in the one of the join data contains it. Indexing the data on field names and high cardinality fields let's you qucikly find the term and the associated value.

Indexing could be used by LLMs to find the interesting column/term mapping in the data set.

  • fieldName - The path to the column in the source

  • fieldValue - The dimensional value for the field (or range if fieldType is not a string)

  • fieldType - Type type of the column.

  • weight - a weighting to use as to the importance the distinct dimensional value. Defaults to cardinality of the field.

Simple Example

We're going to take the airports table and index it. The results are an un ordered list of distinct fieldName/fieldValue pairs appear in the table. The weight, in this case is the number of rows that partciular occurs on.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
}
QUERY RESULTS
fieldNamefieldValuefieldTypeweight
codeWV39string1
code47Istring1
codeWY62string1
site_number25878.5*Astring1
site_number55030.*Astring1
site_number25584.12*Hstring1
site_number25496.122*Astring1
site_number25544.*Astring1
site_number26138.*Astring1
site_number26273.61*Astring1
site_number26172.6*Hstring1
site_number26465.3*Astring1
site_number26274.7*Astring1
site_number26441.1*Astring1
site_number26130.*Astring1
site_number26412.*Astring1
site_number27637.*Astring1
site_number27132.21*Astring1
site_number27187.*Astring1
site_number27061.01*Astring1
site_number27643.5*Astring1
site_number27170.01*Hstring1
site_number26693.04*Hstring1
site_number26888.*Astring1
site_number27873.01*Hstring1
site_number27881.11*Astring1
site_number27386.*Astring1
site_number26948.11*Hstring1
cityFREDERIKSTED ST CROIXstring1
cityWILBURstring3
cityLA CENTERstring2
cityWENATCHEEstring3
cityCENTER ISLANDstring1
cityQUILLAYUTEstring1
cityIRMAstring1
cityEAU CLAIREstring5
cityARBOVALEstring1
cityCODYstring4
countyFORESTstring3
full_nameSTOUFFER GRAND BEACH RESORTstring1
full_nameSHELBURNEstring1
full_nameMC MAHON FIELDstring1
full_nameSKATTER CREEKstring1
full_nameBURNT RIDGE AIRSTRIPstring1
full_nameTHE BOEING COMPANYstring1
full_nameENUMCLAWstring1
full_nameSTUPEK FARMSstring1
full_nameJOHN DEGERMAN FIELDstring1
full_nameNORTHWOODS HOSPITALstring1
full_nameFORT ATKINSON MEMORIAL HOSPITALstring1
full_nameJOSHUA SANFORD FIELDstring1
full_nameROXstring1
full_nameFIVE CORNERS AIRWAYSstring1
full_nameWAUSAU HOSPITAL CENTERstring1
full_nameSTEVENS POINT MUNIstring1
full_nameKENOSHA REGIONALstring1
full_nameCENTER WHEELING PARKING GARAGEstring1
full_nameWALKER VIstring1
certAU 04/1981string1
codeVT34string1
code9WA3string1
code5WI0string1
code38WIstring1
code05WIstring1
codeWS04string1
codeW19string1
codeWY37string1
codeTHPstring1
site_number26245.9*Ustring1
site_number26402.01*Astring1
site_number26455.8*Hstring1
site_number26374.1*Astring1
site_number26275.*Astring1
site_number27200.*Astring1
site_number27388.*Astring1
site_number27896.6*Astring1
cityWOODINVILLEstring2
citySUNNYSIDEstring1
cityKEWAUNEEstring2
cityDRESSERstring1
cityST CROIX FALLSstring1
full_nameE.A.DEEDS FARMstring1
full_nameLAKE UNIONstring1
full_nameJEFFERSON GENERAL HOSPITALstring1
full_nameCHILCOTT FARMSstring1
full_nameDEER HAVEN RANCHstring1
full_nameMC MANUS HOONCH-NA-SHEE-KAWstring1
full_nameIBERLIN STRIPstring2
full_nameWARREN-SUGARBUSHstring1
codeORFstring1
code75WIstring1
codeWV33string1
codeWV05string1
site_number26477.*Astring1
site_number26361.*Astring1
site_number27562.01*Hstring1
site_number27291.*Astring1
site_number27677.01*Astring1
site_number26899.*Astring1
site_number26660.5*Astring1
[
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "WV39",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "47I",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "WY62",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "25878.5*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "55030.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "25584.12*H",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "25496.122*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "25544.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26138.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26273.61*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26172.6*H",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26465.3*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26274.7*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26441.1*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26130.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26412.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27637.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27132.21*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27187.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27061.01*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27643.5*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27170.01*H",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26693.04*H",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26888.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27873.01*H",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27881.11*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27386.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26948.11*H",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "FREDERIKSTED  ST CROIX",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "WILBUR",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "LA CENTER",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "WENATCHEE",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "CENTER ISLAND",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "QUILLAYUTE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "IRMA",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "EAU CLAIRE",
    "weight": 5
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "ARBOVALE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "CODY",
    "weight": 4
  },
  {
    "fieldName": "county",
    "fieldType": "string",
    "fieldValue": "FOREST",
    "weight": 3
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "STOUFFER GRAND BEACH RESORT",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "SHELBURNE",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "MC MAHON FIELD",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "SKATTER CREEK",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "BURNT RIDGE AIRSTRIP",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "THE BOEING COMPANY",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "ENUMCLAW",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "STUPEK FARMS",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "JOHN DEGERMAN FIELD",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "NORTHWOODS HOSPITAL",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "FORT ATKINSON MEMORIAL HOSPITAL",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "JOSHUA SANFORD FIELD",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "ROX",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "FIVE CORNERS AIRWAYS",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "WAUSAU HOSPITAL CENTER",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "STEVENS POINT MUNI",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "KENOSHA REGIONAL",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "CENTER WHEELING PARKING GARAGE",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "WALKER VI",
    "weight": 1
  },
  {
    "fieldName": "cert",
    "fieldType": "string",
    "fieldValue": "AU 04/1981",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "VT34",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "9WA3",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "5WI0",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "38WI",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "05WI",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "WS04",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "W19",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "WY37",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "THP",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26245.9*U",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26402.01*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26455.8*H",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26374.1*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26275.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27200.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27388.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27896.6*A",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "WOODINVILLE",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "SUNNYSIDE",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "KEWAUNEE",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "DRESSER",
    "weight": 1
  },
  {
    "fieldName": "city",
    "fieldType": "string",
    "fieldValue": "ST CROIX FALLS",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "E.A.DEEDS FARM",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "LAKE UNION",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "JEFFERSON GENERAL HOSPITAL",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "CHILCOTT FARMS",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "DEER HAVEN RANCH",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "MC MANUS HOONCH-NA-SHEE-KAW",
    "weight": 1
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "IBERLIN STRIP",
    "weight": 2
  },
  {
    "fieldName": "full_name",
    "fieldType": "string",
    "fieldValue": "WARREN-SUGARBUSH",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "ORF",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "75WI",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "WV33",
    "weight": 1
  },
  {
    "fieldName": "code",
    "fieldType": "string",
    "fieldValue": "WV05",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26477.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26361.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27562.01*H",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27291.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "27677.01*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26899.*A",
    "weight": 1
  },
  {
    "fieldName": "site_number",
    "fieldType": "string",
    "fieldValue": "26660.5*A",
    "weight": 1
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'id'
      WHEN 1 THEN 'code'
      WHEN 2 THEN 'site_number'
      WHEN 3 THEN 'fac_type'
      WHEN 4 THEN 'fac_use'
      WHEN 5 THEN 'faa_region'
      WHEN 6 THEN 'faa_dist'
      WHEN 7 THEN 'city'
      WHEN 8 THEN 'county'
      WHEN 9 THEN 'state'
      WHEN 10 THEN 'full_name'
      WHEN 11 THEN 'own_type'
      WHEN 12 THEN 'longitude'
      WHEN 13 THEN 'latitude'
      WHEN 14 THEN 'elevation'
      WHEN 15 THEN 'aero_cht'
      WHEN 16 THEN 'cbd_dist'
      WHEN 17 THEN 'cbd_dir'
      WHEN 18 THEN 'act_date'
      WHEN 19 THEN 'cert'
      WHEN 20 THEN 'fed_agree'
      WHEN 21 THEN 'cust_intl'
      WHEN 22 THEN 'c_ldg_rts'
      WHEN 23 THEN 'joint_use'
      WHEN 24 THEN 'mil_rts'
      WHEN 25 THEN 'cntl_twr'
      WHEN 26 THEN 'major'
    END as "fieldName",  CASE group_set
      WHEN 0 THEN 'number'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'string'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'string'
      WHEN 12 THEN 'number'
      WHEN 13 THEN 'number'
      WHEN 14 THEN 'number'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'number'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'string'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'string'
      WHEN 21 THEN 'string'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN NULL
      WHEN 1 THEN base."code"
      WHEN 2 THEN base."site_number"
      WHEN 3 THEN base."fac_type"
      WHEN 4 THEN base."fac_use"
      WHEN 5 THEN base."faa_region"
      WHEN 6 THEN base."faa_dist"
      WHEN 7 THEN base."city"
      WHEN 8 THEN base."county"
      WHEN 9 THEN base."state"
      WHEN 10 THEN base."full_name"
      WHEN 11 THEN base."own_type"
      WHEN 15 THEN base."aero_cht"
      WHEN 17 THEN base."cbd_dir"
      WHEN 18 THEN base."act_date"
      WHEN 19 THEN base."cert"
      WHEN 20 THEN base."fed_agree"
      WHEN 21 THEN base."cust_intl"
      WHEN 22 THEN base."c_ldg_rts"
      WHEN 23 THEN base."joint_use"
      WHEN 24 THEN base."mil_rts"
      WHEN 25 THEN base."cntl_twr"
      WHEN 26 THEN base."major"
    END as "fieldValue",
   COUNT(*) as weight,
    CASE group_set
      WHEN 99999 THEN ''    WHEN 0 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 12 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
      WHEN 13 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 14 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 16 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4
)
SELECT
  "fieldName",
  "fieldType",
  COALESCE("fieldValue", "fieldRange") as "fieldValue",
  weight
FROM __stage0

Add Ordering

Adding a second query stage to filter on string columns and ordering by weight descending shows us the most common fieldName/fieldValue pairs in the dataset.

All Malloy queries run as a single SQL query. The index: operator is no different. Click the SQL tab to see how this works.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
} -> {
  where: fieldType = 'string'
  select: *
  order_by: weight desc
}
QUERY RESULTS
fieldNamefieldValuefieldTypeweight
majorNstring19,523
certstring19,142
cntl_twrNstring19,124
fed_agreestring16,252
cust_intlstring15,145
c_ldg_rtsstring15,145
joint_usestring14,804
mil_rtsstring14,716
fac_usePRstring14,428
own_typePRstring14,306
fac_typeAIRPORTstring13,925
act_datestring12,040
faa_distNONEstring7,085
fac_usePUstring5,365
own_typePUstring5,174
fac_typeHELIPORTstring5,135
joint_useNstring4,779
cust_intlNstring4,575
faa_regionAGLstring4,437
c_ldg_rtsNstring4,365
cbd_dirNstring3,694
faa_regionASWstring3,268
mil_rtsYstring2,958
faa_regionASOstring2,924
faa_regionAEAstring2,586
cbd_dirNWstring2,438
cbd_dirSWstring2,427
cbd_dirNEstring2,393
cbd_dirSEstring2,281
cbd_dirSstring2,155
cbd_dirEstring2,127
mil_rtsNstring2,119
faa_regionANMstring2,102
cbd_dirWstring2,065
stateTXstring1,845
fed_agreeNGYstring1,682
aero_chtNEW YORKstring1,581
faa_regionACEstring1,579
faa_distCHIstring1,528
faa_regionAWPstring1,503
aero_chtCHICAGOstring1,480
aero_chtDETROITstring1,262
faa_distDETstring1,240
faa_distSEAstring1,162
faa_distMSPstring1,046
faa_distATLstring1,029
stateCAstring984
faa_distNYCstring957
faa_distORLstring905
stateILstring890
aero_chtDALLAS-FT WORTHstring888
aero_chtWASHINGTONstring887
stateFLstring856
aero_chtST LOUISstring847
faa_distHARstring846
statePAstring804
aero_chtATLANTAstring780
aero_chtHOUSTONstring765
faa_regionANEstring763
stateOHstring749
aero_chtSEATTLEstring749
aero_chtKANSAS CITYstring713
aero_chtTWIN CITIESstring699
faa_distDENstring679
cntl_twrYstring669
faa_distDCAstring667
stateINstring643
aero_chtCINCINNATIstring624
faa_distBISstring616
faa_regionAALstring608
stateAKstring608
aero_chtMEMPHISstring605
faa_distSFOstring585
stateNYstring576
aero_chtOMAHAstring551
stateWIstring543
stateMOstring537
fed_agreeNstring515
stateMNstring507
aero_chtSAN ANTONIOstring503
faa_distJANstring502
stateLAstring500
aero_chtLOS ANGELESstring498
stateMIstring489
faa_distMEMstring487
stateWAstring484
fac_typeSEAPLANE BASEstring473
aero_chtJACKSONVILLEstring467
aero_chtNEW ORLEANSstring444
stateOKstring443
stateORstring441
stateGAstring440
stateNDstring436
aero_chtSAN FRANCISCOstring427
stateCOstring425
stateVAstring421
aero_chtGREEN BAYstring418
stateKSstring415
stateNCstring400
aero_chtCHARLOTTEstring380
[
  {
    "fieldName": "major",
    "fieldValue": "N",
    "fieldType": "string",
    "weight": 19523
  },
  {
    "fieldName": "cert",
    "fieldValue": null,
    "fieldType": "string",
    "weight": 19142
  },
  {
    "fieldName": "cntl_twr",
    "fieldValue": "N",
    "fieldType": "string",
    "weight": 19124
  },
  {
    "fieldName": "fed_agree",
    "fieldValue": null,
    "fieldType": "string",
    "weight": 16252
  },
  {
    "fieldName": "cust_intl",
    "fieldValue": null,
    "fieldType": "string",
    "weight": 15145
  },
  {
    "fieldName": "c_ldg_rts",
    "fieldValue": null,
    "fieldType": "string",
    "weight": 15145
  },
  {
    "fieldName": "joint_use",
    "fieldValue": null,
    "fieldType": "string",
    "weight": 14804
  },
  {
    "fieldName": "mil_rts",
    "fieldValue": null,
    "fieldType": "string",
    "weight": 14716
  },
  {
    "fieldName": "fac_use",
    "fieldValue": "PR",
    "fieldType": "string",
    "weight": 14428
  },
  {
    "fieldName": "own_type",
    "fieldValue": "PR",
    "fieldType": "string",
    "weight": 14306
  },
  {
    "fieldName": "fac_type",
    "fieldValue": "AIRPORT",
    "fieldType": "string",
    "weight": 13925
  },
  {
    "fieldName": "act_date",
    "fieldValue": null,
    "fieldType": "string",
    "weight": 12040
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "NONE",
    "fieldType": "string",
    "weight": 7085
  },
  {
    "fieldName": "fac_use",
    "fieldValue": "PU",
    "fieldType": "string",
    "weight": 5365
  },
  {
    "fieldName": "own_type",
    "fieldValue": "PU",
    "fieldType": "string",
    "weight": 5174
  },
  {
    "fieldName": "fac_type",
    "fieldValue": "HELIPORT",
    "fieldType": "string",
    "weight": 5135
  },
  {
    "fieldName": "joint_use",
    "fieldValue": "N",
    "fieldType": "string",
    "weight": 4779
  },
  {
    "fieldName": "cust_intl",
    "fieldValue": "N",
    "fieldType": "string",
    "weight": 4575
  },
  {
    "fieldName": "faa_region",
    "fieldValue": "AGL",
    "fieldType": "string",
    "weight": 4437
  },
  {
    "fieldName": "c_ldg_rts",
    "fieldValue": "N",
    "fieldType": "string",
    "weight": 4365
  },
  {
    "fieldName": "cbd_dir",
    "fieldValue": "N",
    "fieldType": "string",
    "weight": 3694
  },
  {
    "fieldName": "faa_region",
    "fieldValue": "ASW",
    "fieldType": "string",
    "weight": 3268
  },
  {
    "fieldName": "mil_rts",
    "fieldValue": "Y",
    "fieldType": "string",
    "weight": 2958
  },
  {
    "fieldName": "faa_region",
    "fieldValue": "ASO",
    "fieldType": "string",
    "weight": 2924
  },
  {
    "fieldName": "faa_region",
    "fieldValue": "AEA",
    "fieldType": "string",
    "weight": 2586
  },
  {
    "fieldName": "cbd_dir",
    "fieldValue": "NW",
    "fieldType": "string",
    "weight": 2438
  },
  {
    "fieldName": "cbd_dir",
    "fieldValue": "SW",
    "fieldType": "string",
    "weight": 2427
  },
  {
    "fieldName": "cbd_dir",
    "fieldValue": "NE",
    "fieldType": "string",
    "weight": 2393
  },
  {
    "fieldName": "cbd_dir",
    "fieldValue": "SE",
    "fieldType": "string",
    "weight": 2281
  },
  {
    "fieldName": "cbd_dir",
    "fieldValue": "S",
    "fieldType": "string",
    "weight": 2155
  },
  {
    "fieldName": "cbd_dir",
    "fieldValue": "E",
    "fieldType": "string",
    "weight": 2127
  },
  {
    "fieldName": "mil_rts",
    "fieldValue": "N",
    "fieldType": "string",
    "weight": 2119
  },
  {
    "fieldName": "faa_region",
    "fieldValue": "ANM",
    "fieldType": "string",
    "weight": 2102
  },
  {
    "fieldName": "cbd_dir",
    "fieldValue": "W",
    "fieldType": "string",
    "weight": 2065
  },
  {
    "fieldName": "state",
    "fieldValue": "TX",
    "fieldType": "string",
    "weight": 1845
  },
  {
    "fieldName": "fed_agree",
    "fieldValue": "NGY",
    "fieldType": "string",
    "weight": 1682
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "NEW YORK",
    "fieldType": "string",
    "weight": 1581
  },
  {
    "fieldName": "faa_region",
    "fieldValue": "ACE",
    "fieldType": "string",
    "weight": 1579
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "CHI",
    "fieldType": "string",
    "weight": 1528
  },
  {
    "fieldName": "faa_region",
    "fieldValue": "AWP",
    "fieldType": "string",
    "weight": 1503
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "CHICAGO",
    "fieldType": "string",
    "weight": 1480
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "DETROIT",
    "fieldType": "string",
    "weight": 1262
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "DET",
    "fieldType": "string",
    "weight": 1240
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "SEA",
    "fieldType": "string",
    "weight": 1162
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "MSP",
    "fieldType": "string",
    "weight": 1046
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "ATL",
    "fieldType": "string",
    "weight": 1029
  },
  {
    "fieldName": "state",
    "fieldValue": "CA",
    "fieldType": "string",
    "weight": 984
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "NYC",
    "fieldType": "string",
    "weight": 957
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "ORL",
    "fieldType": "string",
    "weight": 905
  },
  {
    "fieldName": "state",
    "fieldValue": "IL",
    "fieldType": "string",
    "weight": 890
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "DALLAS-FT WORTH",
    "fieldType": "string",
    "weight": 888
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "WASHINGTON",
    "fieldType": "string",
    "weight": 887
  },
  {
    "fieldName": "state",
    "fieldValue": "FL",
    "fieldType": "string",
    "weight": 856
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "ST LOUIS",
    "fieldType": "string",
    "weight": 847
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "HAR",
    "fieldType": "string",
    "weight": 846
  },
  {
    "fieldName": "state",
    "fieldValue": "PA",
    "fieldType": "string",
    "weight": 804
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "ATLANTA",
    "fieldType": "string",
    "weight": 780
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "HOUSTON",
    "fieldType": "string",
    "weight": 765
  },
  {
    "fieldName": "faa_region",
    "fieldValue": "ANE",
    "fieldType": "string",
    "weight": 763
  },
  {
    "fieldName": "state",
    "fieldValue": "OH",
    "fieldType": "string",
    "weight": 749
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "SEATTLE",
    "fieldType": "string",
    "weight": 749
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "KANSAS CITY",
    "fieldType": "string",
    "weight": 713
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "TWIN CITIES",
    "fieldType": "string",
    "weight": 699
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "DEN",
    "fieldType": "string",
    "weight": 679
  },
  {
    "fieldName": "cntl_twr",
    "fieldValue": "Y",
    "fieldType": "string",
    "weight": 669
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "DCA",
    "fieldType": "string",
    "weight": 667
  },
  {
    "fieldName": "state",
    "fieldValue": "IN",
    "fieldType": "string",
    "weight": 643
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "CINCINNATI",
    "fieldType": "string",
    "weight": 624
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "BIS",
    "fieldType": "string",
    "weight": 616
  },
  {
    "fieldName": "faa_region",
    "fieldValue": "AAL",
    "fieldType": "string",
    "weight": 608
  },
  {
    "fieldName": "state",
    "fieldValue": "AK",
    "fieldType": "string",
    "weight": 608
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "MEMPHIS",
    "fieldType": "string",
    "weight": 605
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "SFO",
    "fieldType": "string",
    "weight": 585
  },
  {
    "fieldName": "state",
    "fieldValue": "NY",
    "fieldType": "string",
    "weight": 576
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "OMAHA",
    "fieldType": "string",
    "weight": 551
  },
  {
    "fieldName": "state",
    "fieldValue": "WI",
    "fieldType": "string",
    "weight": 543
  },
  {
    "fieldName": "state",
    "fieldValue": "MO",
    "fieldType": "string",
    "weight": 537
  },
  {
    "fieldName": "fed_agree",
    "fieldValue": "N",
    "fieldType": "string",
    "weight": 515
  },
  {
    "fieldName": "state",
    "fieldValue": "MN",
    "fieldType": "string",
    "weight": 507
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "SAN ANTONIO",
    "fieldType": "string",
    "weight": 503
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "JAN",
    "fieldType": "string",
    "weight": 502
  },
  {
    "fieldName": "state",
    "fieldValue": "LA",
    "fieldType": "string",
    "weight": 500
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "LOS ANGELES",
    "fieldType": "string",
    "weight": 498
  },
  {
    "fieldName": "state",
    "fieldValue": "MI",
    "fieldType": "string",
    "weight": 489
  },
  {
    "fieldName": "faa_dist",
    "fieldValue": "MEM",
    "fieldType": "string",
    "weight": 487
  },
  {
    "fieldName": "state",
    "fieldValue": "WA",
    "fieldType": "string",
    "weight": 484
  },
  {
    "fieldName": "fac_type",
    "fieldValue": "SEAPLANE BASE",
    "fieldType": "string",
    "weight": 473
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "JACKSONVILLE",
    "fieldType": "string",
    "weight": 467
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "NEW ORLEANS",
    "fieldType": "string",
    "weight": 444
  },
  {
    "fieldName": "state",
    "fieldValue": "OK",
    "fieldType": "string",
    "weight": 443
  },
  {
    "fieldName": "state",
    "fieldValue": "OR",
    "fieldType": "string",
    "weight": 441
  },
  {
    "fieldName": "state",
    "fieldValue": "GA",
    "fieldType": "string",
    "weight": 440
  },
  {
    "fieldName": "state",
    "fieldValue": "ND",
    "fieldType": "string",
    "weight": 436
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "SAN FRANCISCO",
    "fieldType": "string",
    "weight": 427
  },
  {
    "fieldName": "state",
    "fieldValue": "CO",
    "fieldType": "string",
    "weight": 425
  },
  {
    "fieldName": "state",
    "fieldValue": "VA",
    "fieldType": "string",
    "weight": 421
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "GREEN BAY",
    "fieldType": "string",
    "weight": 418
  },
  {
    "fieldName": "state",
    "fieldValue": "KS",
    "fieldType": "string",
    "weight": 415
  },
  {
    "fieldName": "state",
    "fieldValue": "NC",
    "fieldType": "string",
    "weight": 400
  },
  {
    "fieldName": "aero_cht",
    "fieldValue": "CHARLOTTE",
    "fieldType": "string",
    "weight": 380
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'id'
      WHEN 1 THEN 'code'
      WHEN 2 THEN 'site_number'
      WHEN 3 THEN 'fac_type'
      WHEN 4 THEN 'fac_use'
      WHEN 5 THEN 'faa_region'
      WHEN 6 THEN 'faa_dist'
      WHEN 7 THEN 'city'
      WHEN 8 THEN 'county'
      WHEN 9 THEN 'state'
      WHEN 10 THEN 'full_name'
      WHEN 11 THEN 'own_type'
      WHEN 12 THEN 'longitude'
      WHEN 13 THEN 'latitude'
      WHEN 14 THEN 'elevation'
      WHEN 15 THEN 'aero_cht'
      WHEN 16 THEN 'cbd_dist'
      WHEN 17 THEN 'cbd_dir'
      WHEN 18 THEN 'act_date'
      WHEN 19 THEN 'cert'
      WHEN 20 THEN 'fed_agree'
      WHEN 21 THEN 'cust_intl'
      WHEN 22 THEN 'c_ldg_rts'
      WHEN 23 THEN 'joint_use'
      WHEN 24 THEN 'mil_rts'
      WHEN 25 THEN 'cntl_twr'
      WHEN 26 THEN 'major'
    END as "fieldName",  CASE group_set
      WHEN 0 THEN 'number'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'string'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'string'
      WHEN 12 THEN 'number'
      WHEN 13 THEN 'number'
      WHEN 14 THEN 'number'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'number'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'string'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'string'
      WHEN 21 THEN 'string'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN NULL
      WHEN 1 THEN base."code"
      WHEN 2 THEN base."site_number"
      WHEN 3 THEN base."fac_type"
      WHEN 4 THEN base."fac_use"
      WHEN 5 THEN base."faa_region"
      WHEN 6 THEN base."faa_dist"
      WHEN 7 THEN base."city"
      WHEN 8 THEN base."county"
      WHEN 9 THEN base."state"
      WHEN 10 THEN base."full_name"
      WHEN 11 THEN base."own_type"
      WHEN 15 THEN base."aero_cht"
      WHEN 17 THEN base."cbd_dir"
      WHEN 18 THEN base."act_date"
      WHEN 19 THEN base."cert"
      WHEN 20 THEN base."fed_agree"
      WHEN 21 THEN base."cust_intl"
      WHEN 22 THEN base."c_ldg_rts"
      WHEN 23 THEN base."joint_use"
      WHEN 24 THEN base."mil_rts"
      WHEN 25 THEN base."cntl_twr"
      WHEN 26 THEN base."major"
    END as "fieldValue",
   COUNT(*) as weight,
    CASE group_set
      WHEN 99999 THEN ''    WHEN 0 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 12 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
      WHEN 13 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 14 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 16 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4
)
, __stage1 AS (
  SELECT
    "fieldName",
    "fieldType",
    COALESCE("fieldValue", "fieldRange") as "fieldValue",
    weight
  FROM __stage0
)
SELECT 
   base."fieldName" as "fieldName",
   base."fieldValue" as "fieldValue",
   base."fieldType" as "fieldType",
   base."weight" as "weight"
FROM __stage1 as base
WHERE base."fieldType"='string'
ORDER BY 4 desc NULLS LAST

Index For Filtering User Interfaces

Indexes can be used find the best way to filter a dataset. For example supposed we'd like to find 'SANTA CRUZ' in the dataset. Upon approaching the dataset, but we don't which column might contain it. In a UI you might imagine that you type 'SANTA' and let have suggestons for values that might be appropriate. In the results we can see that top value, 'SANTA ROSA', appears as county on 26 rows in the table. We can also see that 'SANTA CRUZ' is both a city and a county..

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
} -> {
  where: fieldValue ~ r'SANTA'
  select: *
  order_by: weight desc
  limit: 15
}
QUERY RESULTS
fieldNamefieldValuefieldTypeweight
countySANTA ROSAstring26
countySANTA BARBARAstring22
countySANTA CRUZstring10
countySANTA CLARAstring10
countySANTA FEstring9
citySANTA ANAstring6
citySANTA FEstring6
citySANTA BARBARAstring5
citySANTA ROSAstring4
citySANTA MARIAstring3
citySANTA YNEZstring3
citySANTA CRUZstring2
citySANTA YSABELstring2
citySANTA ELENAstring2
cityRANCHO SANTA MARGARITAstring2
[
  {
    "fieldName": "county",
    "fieldValue": "SANTA ROSA",
    "fieldType": "string",
    "weight": 26
  },
  {
    "fieldName": "county",
    "fieldValue": "SANTA BARBARA",
    "fieldType": "string",
    "weight": 22
  },
  {
    "fieldName": "county",
    "fieldValue": "SANTA CRUZ",
    "fieldType": "string",
    "weight": 10
  },
  {
    "fieldName": "county",
    "fieldValue": "SANTA CLARA",
    "fieldType": "string",
    "weight": 10
  },
  {
    "fieldName": "county",
    "fieldValue": "SANTA FE",
    "fieldType": "string",
    "weight": 9
  },
  {
    "fieldName": "city",
    "fieldValue": "SANTA ANA",
    "fieldType": "string",
    "weight": 6
  },
  {
    "fieldName": "city",
    "fieldValue": "SANTA FE",
    "fieldType": "string",
    "weight": 6
  },
  {
    "fieldName": "city",
    "fieldValue": "SANTA BARBARA",
    "fieldType": "string",
    "weight": 5
  },
  {
    "fieldName": "city",
    "fieldValue": "SANTA ROSA",
    "fieldType": "string",
    "weight": 4
  },
  {
    "fieldName": "city",
    "fieldValue": "SANTA MARIA",
    "fieldType": "string",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldValue": "SANTA YNEZ",
    "fieldType": "string",
    "weight": 3
  },
  {
    "fieldName": "city",
    "fieldValue": "SANTA CRUZ",
    "fieldType": "string",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldValue": "SANTA YSABEL",
    "fieldType": "string",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldValue": "SANTA ELENA",
    "fieldType": "string",
    "weight": 2
  },
  {
    "fieldName": "city",
    "fieldValue": "RANCHO SANTA MARGARITA",
    "fieldType": "string",
    "weight": 2
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'id'
      WHEN 1 THEN 'code'
      WHEN 2 THEN 'site_number'
      WHEN 3 THEN 'fac_type'
      WHEN 4 THEN 'fac_use'
      WHEN 5 THEN 'faa_region'
      WHEN 6 THEN 'faa_dist'
      WHEN 7 THEN 'city'
      WHEN 8 THEN 'county'
      WHEN 9 THEN 'state'
      WHEN 10 THEN 'full_name'
      WHEN 11 THEN 'own_type'
      WHEN 12 THEN 'longitude'
      WHEN 13 THEN 'latitude'
      WHEN 14 THEN 'elevation'
      WHEN 15 THEN 'aero_cht'
      WHEN 16 THEN 'cbd_dist'
      WHEN 17 THEN 'cbd_dir'
      WHEN 18 THEN 'act_date'
      WHEN 19 THEN 'cert'
      WHEN 20 THEN 'fed_agree'
      WHEN 21 THEN 'cust_intl'
      WHEN 22 THEN 'c_ldg_rts'
      WHEN 23 THEN 'joint_use'
      WHEN 24 THEN 'mil_rts'
      WHEN 25 THEN 'cntl_twr'
      WHEN 26 THEN 'major'
    END as "fieldName",  CASE group_set
      WHEN 0 THEN 'number'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'string'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'string'
      WHEN 12 THEN 'number'
      WHEN 13 THEN 'number'
      WHEN 14 THEN 'number'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'number'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'string'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'string'
      WHEN 21 THEN 'string'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN NULL
      WHEN 1 THEN base."code"
      WHEN 2 THEN base."site_number"
      WHEN 3 THEN base."fac_type"
      WHEN 4 THEN base."fac_use"
      WHEN 5 THEN base."faa_region"
      WHEN 6 THEN base."faa_dist"
      WHEN 7 THEN base."city"
      WHEN 8 THEN base."county"
      WHEN 9 THEN base."state"
      WHEN 10 THEN base."full_name"
      WHEN 11 THEN base."own_type"
      WHEN 15 THEN base."aero_cht"
      WHEN 17 THEN base."cbd_dir"
      WHEN 18 THEN base."act_date"
      WHEN 19 THEN base."cert"
      WHEN 20 THEN base."fed_agree"
      WHEN 21 THEN base."cust_intl"
      WHEN 22 THEN base."c_ldg_rts"
      WHEN 23 THEN base."joint_use"
      WHEN 24 THEN base."mil_rts"
      WHEN 25 THEN base."cntl_twr"
      WHEN 26 THEN base."major"
    END as "fieldValue",
   COUNT(*) as weight,
    CASE group_set
      WHEN 99999 THEN ''    WHEN 0 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 12 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
      WHEN 13 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 14 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 16 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4
)
, __stage1 AS (
  SELECT
    "fieldName",
    "fieldType",
    COALESCE("fieldValue", "fieldRange") as "fieldValue",
    weight
  FROM __stage0
)
SELECT 
   base."fieldName" as "fieldName",
   base."fieldValue" as "fieldValue",
   base."fieldType" as "fieldType",
   base."weight" as "weight"
FROM __stage1 as base
WHERE REGEXP_MATCHES(base."fieldValue", 'SANTA')
ORDER BY 4 desc NULLS LAST
LIMIT 15

We can then write a simple query to show the rows. It turns out that 'SANTA CRUZ' is a county in both California and Arizona.

document
run: duckdb.table('../data/airports.parquet') -> {
  where: county ~ 'SANTA CRUZ'
  select: *
}
QUERY RESULTS
idcodesite_​numberfac_​typefac_​usefaa_​regionfaa_​distcitycountystatefull_​nameown_​typelongitudelatitudeelevationaero_​chtcbd_​distcbd_​diract_​datecertfed_​agreecust_​intlc_​ldg_​rtsjoint_​usemil_​rtscntl_​twrmajor
1,4442AZ800811.*UULTRALIGHTPRAWPNONETUBACSANTA CRUZAZTUBAC ULTRALIGHT FLIGHTPARKPR-111.0331.613,200PHOENIX1E10/1987NN
1,312OLS00739.*AAIRPORTPUAWPNONENOGALESSANTA CRUZAZNOGALES INTLPU-110.8431.413,955PHOENIX7NENGYYNNYNN
2,287NSI02220.*AAIRPORTPRAWPNONESAN NICOLAS ISLANDSANTA CRUZCASAN NICOLAS ISLAND NOLFMN-119.4533.23504LOS ANGELES0YYN
2,307CL7702241.1*AAIRPORTPRAWPSFOSANTA CRUZSANTA CRUZCABONNY DOON VILLAGEPR-122.1237.072,020SAN FRANCISCO8NW07/1986NN
2,308CA3702241.12*HHELIPORTPRAWPSFOSANTA CRUZSANTA CRUZCADOMINICAN SANTA CRUZ HOSPITALPR-121.9836.99115SAN FRANCISCO3NENN
1,6936Q601486.9*AAIRPORTPRAWPSFODAVENPORTSANTA CRUZCALAS TRANCASPR-122.2737.08125SAN FRANCISCO6NWNNNN
2,441CA6502429.1*HHELIPORTPRAWPSFOWATSONVILLESANTA CRUZCAALTA VISTAPR-121.8636.97480SAN FRANCISCO5NWNNNN
2,442CL9902429.11*HHELIPORTPRAWPSFOWATSONVILLESANTA CRUZCAWATSONVILLE COMMUNITY HOSPITALPU-121.7736.93111SAN FRANCISCO0E09/1987NN
2,443CA6602429.2*AAIRPORTPRAWPSFOWATSONVILLESANTA CRUZCAMONTEREY BAY ACADEMYPR-121.8436.970SAN FRANCISCO4WNNNN
2,440WVI02429.*AAIRPORTPUAWPSFOWATSONVILLESANTA CRUZCAWATSONVILLE MUNIPU-121.7836.93160SAN FRANCISCO3NWNGPRYNNNYNN
[
  {
    "id": 1444,
    "code": "2AZ8",
    "site_number": "00811.*U",
    "fac_type": "ULTRALIGHT",
    "fac_use": "PR",
    "faa_region": "AWP",
    "faa_dist": "NONE",
    "city": "TUBAC",
    "county": "SANTA CRUZ",
    "state": "AZ",
    "full_name": "TUBAC ULTRALIGHT FLIGHTPARK",
    "own_type": "PR",
    "longitude": -111.03,
    "latitude": 31.61,
    "elevation": 3200,
    "aero_cht": "PHOENIX",
    "cbd_dist": 1,
    "cbd_dir": "E",
    "act_date": "10/1987",
    "cert": null,
    "fed_agree": null,
    "cust_intl": null,
    "c_ldg_rts": null,
    "joint_use": null,
    "mil_rts": null,
    "cntl_twr": "N",
    "major": "N"
  },
  {
    "id": 1312,
    "code": "OLS",
    "site_number": "00739.*A",
    "fac_type": "AIRPORT",
    "fac_use": "PU",
    "faa_region": "AWP",
    "faa_dist": "NONE",
    "city": "NOGALES",
    "county": "SANTA CRUZ",
    "state": "AZ",
    "full_name": "NOGALES INTL",
    "own_type": "PU",
    "longitude": -110.84,
    "latitude": 31.41,
    "elevation": 3955,
    "aero_cht": "PHOENIX",
    "cbd_dist": 7,
    "cbd_dir": "NE",
    "act_date": null,
    "cert": null,
    "fed_agree": "NGY",
    "cust_intl": "Y",
    "c_ldg_rts": "N",
    "joint_use": "N",
    "mil_rts": "Y",
    "cntl_twr": "N",
    "major": "N"
  },
  {
    "id": 2287,
    "code": "NSI",
    "site_number": "02220.*A",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "faa_region": "AWP",
    "faa_dist": "NONE",
    "city": "SAN NICOLAS ISLAND",
    "county": "SANTA CRUZ",
    "state": "CA",
    "full_name": "SAN NICOLAS ISLAND NOLF",
    "own_type": "MN",
    "longitude": -119.45,
    "latitude": 33.23,
    "elevation": 504,
    "aero_cht": "LOS ANGELES",
    "cbd_dist": 0,
    "cbd_dir": null,
    "act_date": null,
    "cert": null,
    "fed_agree": null,
    "cust_intl": null,
    "c_ldg_rts": null,
    "joint_use": null,
    "mil_rts": "Y",
    "cntl_twr": "Y",
    "major": "N"
  },
  {
    "id": 2307,
    "code": "CL77",
    "site_number": "02241.1*A",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "faa_region": "AWP",
    "faa_dist": "SFO",
    "city": "SANTA CRUZ",
    "county": "SANTA CRUZ",
    "state": "CA",
    "full_name": "BONNY DOON VILLAGE",
    "own_type": "PR",
    "longitude": -122.12,
    "latitude": 37.07,
    "elevation": 2020,
    "aero_cht": "SAN FRANCISCO",
    "cbd_dist": 8,
    "cbd_dir": "NW",
    "act_date": "07/1986",
    "cert": null,
    "fed_agree": null,
    "cust_intl": null,
    "c_ldg_rts": null,
    "joint_use": null,
    "mil_rts": null,
    "cntl_twr": "N",
    "major": "N"
  },
  {
    "id": 2308,
    "code": "CA37",
    "site_number": "02241.12*H",
    "fac_type": "HELIPORT",
    "fac_use": "PR",
    "faa_region": "AWP",
    "faa_dist": "SFO",
    "city": "SANTA CRUZ",
    "county": "SANTA CRUZ",
    "state": "CA",
    "full_name": "DOMINICAN SANTA CRUZ HOSPITAL",
    "own_type": "PR",
    "longitude": -121.98,
    "latitude": 36.99,
    "elevation": 115,
    "aero_cht": "SAN FRANCISCO",
    "cbd_dist": 3,
    "cbd_dir": "NE",
    "act_date": null,
    "cert": null,
    "fed_agree": null,
    "cust_intl": null,
    "c_ldg_rts": null,
    "joint_use": null,
    "mil_rts": null,
    "cntl_twr": "N",
    "major": "N"
  },
  {
    "id": 1693,
    "code": "6Q6",
    "site_number": "01486.9*A",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "faa_region": "AWP",
    "faa_dist": "SFO",
    "city": "DAVENPORT",
    "county": "SANTA CRUZ",
    "state": "CA",
    "full_name": "LAS TRANCAS",
    "own_type": "PR",
    "longitude": -122.27,
    "latitude": 37.08,
    "elevation": 125,
    "aero_cht": "SAN FRANCISCO",
    "cbd_dist": 6,
    "cbd_dir": "NW",
    "act_date": null,
    "cert": null,
    "fed_agree": null,
    "cust_intl": null,
    "c_ldg_rts": null,
    "joint_use": "N",
    "mil_rts": "N",
    "cntl_twr": "N",
    "major": "N"
  },
  {
    "id": 2441,
    "code": "CA65",
    "site_number": "02429.1*H",
    "fac_type": "HELIPORT",
    "fac_use": "PR",
    "faa_region": "AWP",
    "faa_dist": "SFO",
    "city": "WATSONVILLE",
    "county": "SANTA CRUZ",
    "state": "CA",
    "full_name": "ALTA VISTA",
    "own_type": "PR",
    "longitude": -121.86,
    "latitude": 36.97,
    "elevation": 480,
    "aero_cht": "SAN FRANCISCO",
    "cbd_dist": 5,
    "cbd_dir": "NW",
    "act_date": null,
    "cert": null,
    "fed_agree": null,
    "cust_intl": null,
    "c_ldg_rts": null,
    "joint_use": "N",
    "mil_rts": "N",
    "cntl_twr": "N",
    "major": "N"
  },
  {
    "id": 2442,
    "code": "CL99",
    "site_number": "02429.11*H",
    "fac_type": "HELIPORT",
    "fac_use": "PR",
    "faa_region": "AWP",
    "faa_dist": "SFO",
    "city": "WATSONVILLE",
    "county": "SANTA CRUZ",
    "state": "CA",
    "full_name": "WATSONVILLE COMMUNITY HOSPITAL",
    "own_type": "PU",
    "longitude": -121.77,
    "latitude": 36.93,
    "elevation": 111,
    "aero_cht": "SAN FRANCISCO",
    "cbd_dist": 0,
    "cbd_dir": "E",
    "act_date": "09/1987",
    "cert": null,
    "fed_agree": null,
    "cust_intl": null,
    "c_ldg_rts": null,
    "joint_use": null,
    "mil_rts": null,
    "cntl_twr": "N",
    "major": "N"
  },
  {
    "id": 2443,
    "code": "CA66",
    "site_number": "02429.2*A",
    "fac_type": "AIRPORT",
    "fac_use": "PR",
    "faa_region": "AWP",
    "faa_dist": "SFO",
    "city": "WATSONVILLE",
    "county": "SANTA CRUZ",
    "state": "CA",
    "full_name": "MONTEREY BAY ACADEMY",
    "own_type": "PR",
    "longitude": -121.84,
    "latitude": 36.9,
    "elevation": 70,
    "aero_cht": "SAN FRANCISCO",
    "cbd_dist": 4,
    "cbd_dir": "W",
    "act_date": null,
    "cert": null,
    "fed_agree": null,
    "cust_intl": null,
    "c_ldg_rts": null,
    "joint_use": "N",
    "mil_rts": "N",
    "cntl_twr": "N",
    "major": "N"
  },
  {
    "id": 2440,
    "code": "WVI",
    "site_number": "02429.*A",
    "fac_type": "AIRPORT",
    "fac_use": "PU",
    "faa_region": "AWP",
    "faa_dist": "SFO",
    "city": "WATSONVILLE",
    "county": "SANTA CRUZ",
    "state": "CA",
    "full_name": "WATSONVILLE MUNI",
    "own_type": "PU",
    "longitude": -121.78,
    "latitude": 36.93,
    "elevation": 160,
    "aero_cht": "SAN FRANCISCO",
    "cbd_dist": 3,
    "cbd_dir": "NW",
    "act_date": null,
    "cert": null,
    "fed_agree": "NGPRY",
    "cust_intl": "N",
    "c_ldg_rts": "N",
    "joint_use": "N",
    "mil_rts": "Y",
    "cntl_twr": "N",
    "major": "N"
  }
]
SELECT 
   base."id" as "id",
   base."code" as "code",
   base."site_number" as "site_number",
   base."fac_type" as "fac_type",
   base."fac_use" as "fac_use",
   base."faa_region" as "faa_region",
   base."faa_dist" as "faa_dist",
   base."city" as "city",
   base."county" as "county",
   base."state" as "state",
   base."full_name" as "full_name",
   base."own_type" as "own_type",
   base."longitude" as "longitude",
   base."latitude" as "latitude",
   base."elevation" as "elevation",
   base."aero_cht" as "aero_cht",
   base."cbd_dist" as "cbd_dist",
   base."cbd_dir" as "cbd_dir",
   base."act_date" as "act_date",
   base."cert" as "cert",
   base."fed_agree" as "fed_agree",
   base."cust_intl" as "cust_intl",
   base."c_ldg_rts" as "c_ldg_rts",
   base."joint_use" as "joint_use",
   base."mil_rts" as "mil_rts",
   base."cntl_twr" as "cntl_twr",
   base."major" as "major"
FROM '../data/airports.parquet' as base
WHERE base."county" LIKE 'SANTA CRUZ'

Indexing to show top values for each dimension

It is often difficult to approach a new dataset. The index operator provides an intersting way to quickly gain an understanding of the dataset. By piping the results of an index another stage, we can quickly see all the interesting values for each of the interesting dimesions. Again, the weight shows the number of rows for that particular dimension/value.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
} -> {
  group_by: fieldName
  nest: values is {
    group_by: fieldValue, weight
    order_by: weight desc
    limit: 10
  }
  order_by: fieldName
}
QUERY RESULTS
fieldNamevalues
act_date
fieldValueweight
12,040
04/1993145
02/1987129
03/1993127
11/1990122
11/1986119
02/1982118
08/1988115
03/1988113
11/1988113
aero_cht
fieldValueweight
NEW YORK1,581
CHICAGO1,480
DETROIT1,262
DALLAS-FT WORTH888
WASHINGTON887
ST LOUIS847
ATLANTA780
HOUSTON765
SEATTLE749
KANSAS CITY713
c_ldg_rts
fieldValueweight
15,145
N4,365
Y283
cbd_dir
fieldValueweight
N3,694
NW2,438
SW2,427
NE2,393
SE2,281
S2,155
E2,127
W2,065
119
SSW20
cbd_dist
fieldValueweight
0 to 7319,793
cert
fieldValueweight
19,142
AS 05/197394
CS 05/197384
BS 05/197381
LU 05/197337
LU 08/199033
DS 05/197326
AU 05/197325
ES 05/197315
LU 11/19749
city
fieldValueweight
HOUSTON108
LOS ANGELES60
COLUMBUS47
SPRINGFIELD45
JACKSON42
WASHINGTON38
GREENVILLE38
CLINTON37
PHOENIX37
PHILADELPHIA35
cntl_twr
fieldValueweight
N19,124
Y669
code
fieldValueweight
Y821
S961
31IS1
L311
3MD51
ME081
GFL1
76OK1
00OR1
00A1
county
fieldValueweight
WASHINGTON214
JEFFERSON199
LOS ANGELES176
MONTGOMERY154
JACKSON140
FRANKLIN140
HARRIS137
MARION127
ORANGE125
MARICOPA117
cust_intl
fieldValueweight
15,145
N4,575
Y73
elevation
fieldValueweight
-1 to 1244219,793
faa_dist
fieldValueweight
NONE7,085
CHI1,528
DET1,240
SEA1,162
MSP1,046
ATL1,029
NYC957
ORL905
HAR846
DEN679
faa_region
fieldValueweight
AGL4,437
ASW3,268
ASO2,924
AEA2,586
ANM2,102
ACE1,579
AWP1,503
ANE763
AAL608
23
fac_type
fieldValueweight
AIRPORT13,925
HELIPORT5,135
SEAPLANE BASE473
ULTRALIGHT125
STOLPORT86
GLIDERPORT37
BALLOONPORT12
fac_use
fieldValueweight
PR14,428
PU5,365
fed_agree
fieldValueweight
16,252
NGY1,682
N515
NGY3219
NGPY171
NY1123
N197
196
NGSY87
NGPY384
full_name
fieldValueweight
MEMORIAL HOSPITAL21
SMITH14
JOHNSON13
ST MARY'S HOSPITAL11
DAVIS FIELD10
HILLTOP10
MILLER10
ST FRANCIS HOSPITAL9
DAVIS9
TAYLOR9
id
fieldValueweight
1 to 1979319,793
joint_use
fieldValueweight
14,804
N4,779
Y210
latitude
fieldValueweight
-14.18 to 71.2819,793
longitude
fieldValueweight
-100.0 to 174.1119,793
major
fieldValueweight
N19,523
Y270
mil_rts
fieldValueweight
14,716
Y2,958
N2,119
own_type
fieldValueweight
PR14,306
PU5,174
MR128
MA107
MN78
site_number
fieldValueweight
05874.*A1
05628.96*A1
09829.6*A1
12293.*A1
17024.*A1
20370.*H1
22753.62*A1
23149.*A1
12223.3*A1
18835.11*H1
state
fieldValueweight
TX1,845
CA984
IL890
FL856
PA804
OH749
IN643
AK608
NY576
WI543
fieldValueweight
19,793
[
  {
    "fieldName": "act_date",
    "values": [
      {
        "fieldValue": null,
        "weight": 12040
      },
      {
        "fieldValue": "04/1993",
        "weight": 145
      },
      {
        "fieldValue": "02/1987",
        "weight": 129
      },
      {
        "fieldValue": "03/1993",
        "weight": 127
      },
      {
        "fieldValue": "11/1990",
        "weight": 122
      },
      {
        "fieldValue": "11/1986",
        "weight": 119
      },
      {
        "fieldValue": "02/1982",
        "weight": 118
      },
      {
        "fieldValue": "08/1988",
        "weight": 115
      },
      {
        "fieldValue": "03/1988",
        "weight": 113
      },
      {
        "fieldValue": "11/1988",
        "weight": 113
      }
    ]
  },
  {
    "fieldName": "aero_cht",
    "values": [
      {
        "fieldValue": "NEW YORK",
        "weight": 1581
      },
      {
        "fieldValue": "CHICAGO",
        "weight": 1480
      },
      {
        "fieldValue": "DETROIT",
        "weight": 1262
      },
      {
        "fieldValue": "DALLAS-FT WORTH",
        "weight": 888
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 887
      },
      {
        "fieldValue": "ST LOUIS",
        "weight": 847
      },
      {
        "fieldValue": "ATLANTA",
        "weight": 780
      },
      {
        "fieldValue": "HOUSTON",
        "weight": 765
      },
      {
        "fieldValue": "SEATTLE",
        "weight": 749
      },
      {
        "fieldValue": "KANSAS CITY",
        "weight": 713
      }
    ]
  },
  {
    "fieldName": "c_ldg_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 15145
      },
      {
        "fieldValue": "N",
        "weight": 4365
      },
      {
        "fieldValue": "Y",
        "weight": 283
      }
    ]
  },
  {
    "fieldName": "cbd_dir",
    "values": [
      {
        "fieldValue": "N",
        "weight": 3694
      },
      {
        "fieldValue": "NW",
        "weight": 2438
      },
      {
        "fieldValue": "SW",
        "weight": 2427
      },
      {
        "fieldValue": "NE",
        "weight": 2393
      },
      {
        "fieldValue": "SE",
        "weight": 2281
      },
      {
        "fieldValue": "S",
        "weight": 2155
      },
      {
        "fieldValue": "E",
        "weight": 2127
      },
      {
        "fieldValue": "W",
        "weight": 2065
      },
      {
        "fieldValue": null,
        "weight": 119
      },
      {
        "fieldValue": "SSW",
        "weight": 20
      }
    ]
  },
  {
    "fieldName": "cbd_dist",
    "values": [
      {
        "fieldValue": "0 to 73",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "cert",
    "values": [
      {
        "fieldValue": null,
        "weight": 19142
      },
      {
        "fieldValue": "AS 05/1973",
        "weight": 94
      },
      {
        "fieldValue": "CS 05/1973",
        "weight": 84
      },
      {
        "fieldValue": "BS 05/1973",
        "weight": 81
      },
      {
        "fieldValue": "LU 05/1973",
        "weight": 37
      },
      {
        "fieldValue": "LU 08/1990",
        "weight": 33
      },
      {
        "fieldValue": "DS 05/1973",
        "weight": 26
      },
      {
        "fieldValue": "AU 05/1973",
        "weight": 25
      },
      {
        "fieldValue": "ES 05/1973",
        "weight": 15
      },
      {
        "fieldValue": "LU 11/1974",
        "weight": 9
      }
    ]
  },
  {
    "fieldName": "city",
    "values": [
      {
        "fieldValue": "HOUSTON",
        "weight": 108
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 60
      },
      {
        "fieldValue": "COLUMBUS",
        "weight": 47
      },
      {
        "fieldValue": "SPRINGFIELD",
        "weight": 45
      },
      {
        "fieldValue": "JACKSON",
        "weight": 42
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 38
      },
      {
        "fieldValue": "GREENVILLE",
        "weight": 38
      },
      {
        "fieldValue": "CLINTON",
        "weight": 37
      },
      {
        "fieldValue": "PHOENIX",
        "weight": 37
      },
      {
        "fieldValue": "PHILADELPHIA",
        "weight": 35
      }
    ]
  },
  {
    "fieldName": "cntl_twr",
    "values": [
      {
        "fieldValue": "N",
        "weight": 19124
      },
      {
        "fieldValue": "Y",
        "weight": 669
      }
    ]
  },
  {
    "fieldName": "code",
    "values": [
      {
        "fieldValue": "Y82",
        "weight": 1
      },
      {
        "fieldValue": "S96",
        "weight": 1
      },
      {
        "fieldValue": "31IS",
        "weight": 1
      },
      {
        "fieldValue": "L31",
        "weight": 1
      },
      {
        "fieldValue": "3MD5",
        "weight": 1
      },
      {
        "fieldValue": "ME08",
        "weight": 1
      },
      {
        "fieldValue": "GFL",
        "weight": 1
      },
      {
        "fieldValue": "76OK",
        "weight": 1
      },
      {
        "fieldValue": "00OR",
        "weight": 1
      },
      {
        "fieldValue": "00A",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "county",
    "values": [
      {
        "fieldValue": "WASHINGTON",
        "weight": 214
      },
      {
        "fieldValue": "JEFFERSON",
        "weight": 199
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 176
      },
      {
        "fieldValue": "MONTGOMERY",
        "weight": 154
      },
      {
        "fieldValue": "JACKSON",
        "weight": 140
      },
      {
        "fieldValue": "FRANKLIN",
        "weight": 140
      },
      {
        "fieldValue": "HARRIS",
        "weight": 137
      },
      {
        "fieldValue": "MARION",
        "weight": 127
      },
      {
        "fieldValue": "ORANGE",
        "weight": 125
      },
      {
        "fieldValue": "MARICOPA",
        "weight": 117
      }
    ]
  },
  {
    "fieldName": "cust_intl",
    "values": [
      {
        "fieldValue": null,
        "weight": 15145
      },
      {
        "fieldValue": "N",
        "weight": 4575
      },
      {
        "fieldValue": "Y",
        "weight": 73
      }
    ]
  },
  {
    "fieldName": "elevation",
    "values": [
      {
        "fieldValue": "-1 to 12442",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "faa_dist",
    "values": [
      {
        "fieldValue": "NONE",
        "weight": 7085
      },
      {
        "fieldValue": "CHI",
        "weight": 1528
      },
      {
        "fieldValue": "DET",
        "weight": 1240
      },
      {
        "fieldValue": "SEA",
        "weight": 1162
      },
      {
        "fieldValue": "MSP",
        "weight": 1046
      },
      {
        "fieldValue": "ATL",
        "weight": 1029
      },
      {
        "fieldValue": "NYC",
        "weight": 957
      },
      {
        "fieldValue": "ORL",
        "weight": 905
      },
      {
        "fieldValue": "HAR",
        "weight": 846
      },
      {
        "fieldValue": "DEN",
        "weight": 679
      }
    ]
  },
  {
    "fieldName": "faa_region",
    "values": [
      {
        "fieldValue": "AGL",
        "weight": 4437
      },
      {
        "fieldValue": "ASW",
        "weight": 3268
      },
      {
        "fieldValue": "ASO",
        "weight": 2924
      },
      {
        "fieldValue": "AEA",
        "weight": 2586
      },
      {
        "fieldValue": "ANM",
        "weight": 2102
      },
      {
        "fieldValue": "ACE",
        "weight": 1579
      },
      {
        "fieldValue": "AWP",
        "weight": 1503
      },
      {
        "fieldValue": "ANE",
        "weight": 763
      },
      {
        "fieldValue": "AAL",
        "weight": 608
      },
      {
        "fieldValue": null,
        "weight": 23
      }
    ]
  },
  {
    "fieldName": "fac_type",
    "values": [
      {
        "fieldValue": "AIRPORT",
        "weight": 13925
      },
      {
        "fieldValue": "HELIPORT",
        "weight": 5135
      },
      {
        "fieldValue": "SEAPLANE BASE",
        "weight": 473
      },
      {
        "fieldValue": "ULTRALIGHT",
        "weight": 125
      },
      {
        "fieldValue": "STOLPORT",
        "weight": 86
      },
      {
        "fieldValue": "GLIDERPORT",
        "weight": 37
      },
      {
        "fieldValue": "BALLOONPORT",
        "weight": 12
      }
    ]
  },
  {
    "fieldName": "fac_use",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 14428
      },
      {
        "fieldValue": "PU",
        "weight": 5365
      }
    ]
  },
  {
    "fieldName": "fed_agree",
    "values": [
      {
        "fieldValue": null,
        "weight": 16252
      },
      {
        "fieldValue": "NGY",
        "weight": 1682
      },
      {
        "fieldValue": "N",
        "weight": 515
      },
      {
        "fieldValue": "NGY3",
        "weight": 219
      },
      {
        "fieldValue": "NGPY",
        "weight": 171
      },
      {
        "fieldValue": "NY1",
        "weight": 123
      },
      {
        "fieldValue": "N1",
        "weight": 97
      },
      {
        "fieldValue": "1",
        "weight": 96
      },
      {
        "fieldValue": "NGSY",
        "weight": 87
      },
      {
        "fieldValue": "NGPY3",
        "weight": 84
      }
    ]
  },
  {
    "fieldName": "full_name",
    "values": [
      {
        "fieldValue": "MEMORIAL HOSPITAL",
        "weight": 21
      },
      {
        "fieldValue": "SMITH",
        "weight": 14
      },
      {
        "fieldValue": "JOHNSON",
        "weight": 13
      },
      {
        "fieldValue": "ST MARY'S HOSPITAL",
        "weight": 11
      },
      {
        "fieldValue": "DAVIS FIELD",
        "weight": 10
      },
      {
        "fieldValue": "HILLTOP",
        "weight": 10
      },
      {
        "fieldValue": "MILLER",
        "weight": 10
      },
      {
        "fieldValue": "ST FRANCIS HOSPITAL",
        "weight": 9
      },
      {
        "fieldValue": "DAVIS",
        "weight": 9
      },
      {
        "fieldValue": "TAYLOR",
        "weight": 9
      }
    ]
  },
  {
    "fieldName": "id",
    "values": [
      {
        "fieldValue": "1 to 19793",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "joint_use",
    "values": [
      {
        "fieldValue": null,
        "weight": 14804
      },
      {
        "fieldValue": "N",
        "weight": 4779
      },
      {
        "fieldValue": "Y",
        "weight": 210
      }
    ]
  },
  {
    "fieldName": "latitude",
    "values": [
      {
        "fieldValue": "-14.18 to 71.28",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "longitude",
    "values": [
      {
        "fieldValue": "-100.0 to 174.11",
        "weight": 19793
      }
    ]
  },
  {
    "fieldName": "major",
    "values": [
      {
        "fieldValue": "N",
        "weight": 19523
      },
      {
        "fieldValue": "Y",
        "weight": 270
      }
    ]
  },
  {
    "fieldName": "mil_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 14716
      },
      {
        "fieldValue": "Y",
        "weight": 2958
      },
      {
        "fieldValue": "N",
        "weight": 2119
      }
    ]
  },
  {
    "fieldName": "own_type",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 14306
      },
      {
        "fieldValue": "PU",
        "weight": 5174
      },
      {
        "fieldValue": "MR",
        "weight": 128
      },
      {
        "fieldValue": "MA",
        "weight": 107
      },
      {
        "fieldValue": "MN",
        "weight": 78
      }
    ]
  },
  {
    "fieldName": "site_number",
    "values": [
      {
        "fieldValue": "05874.*A",
        "weight": 1
      },
      {
        "fieldValue": "05628.96*A",
        "weight": 1
      },
      {
        "fieldValue": "09829.6*A",
        "weight": 1
      },
      {
        "fieldValue": "12293.*A",
        "weight": 1
      },
      {
        "fieldValue": "17024.*A",
        "weight": 1
      },
      {
        "fieldValue": "20370.*H",
        "weight": 1
      },
      {
        "fieldValue": "22753.62*A",
        "weight": 1
      },
      {
        "fieldValue": "23149.*A",
        "weight": 1
      },
      {
        "fieldValue": "12223.3*A",
        "weight": 1
      },
      {
        "fieldValue": "18835.11*H",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "state",
    "values": [
      {
        "fieldValue": "TX",
        "weight": 1845
      },
      {
        "fieldValue": "CA",
        "weight": 984
      },
      {
        "fieldValue": "IL",
        "weight": 890
      },
      {
        "fieldValue": "FL",
        "weight": 856
      },
      {
        "fieldValue": "PA",
        "weight": 804
      },
      {
        "fieldValue": "OH",
        "weight": 749
      },
      {
        "fieldValue": "IN",
        "weight": 643
      },
      {
        "fieldValue": "AK",
        "weight": 608
      },
      {
        "fieldValue": "NY",
        "weight": 576
      },
      {
        "fieldValue": "WI",
        "weight": 543
      }
    ]
  },
  {
    "fieldName": null,
    "values": [
      {
        "fieldValue": null,
        "weight": 19793
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'id'
      WHEN 1 THEN 'code'
      WHEN 2 THEN 'site_number'
      WHEN 3 THEN 'fac_type'
      WHEN 4 THEN 'fac_use'
      WHEN 5 THEN 'faa_region'
      WHEN 6 THEN 'faa_dist'
      WHEN 7 THEN 'city'
      WHEN 8 THEN 'county'
      WHEN 9 THEN 'state'
      WHEN 10 THEN 'full_name'
      WHEN 11 THEN 'own_type'
      WHEN 12 THEN 'longitude'
      WHEN 13 THEN 'latitude'
      WHEN 14 THEN 'elevation'
      WHEN 15 THEN 'aero_cht'
      WHEN 16 THEN 'cbd_dist'
      WHEN 17 THEN 'cbd_dir'
      WHEN 18 THEN 'act_date'
      WHEN 19 THEN 'cert'
      WHEN 20 THEN 'fed_agree'
      WHEN 21 THEN 'cust_intl'
      WHEN 22 THEN 'c_ldg_rts'
      WHEN 23 THEN 'joint_use'
      WHEN 24 THEN 'mil_rts'
      WHEN 25 THEN 'cntl_twr'
      WHEN 26 THEN 'major'
    END as "fieldName",  CASE group_set
      WHEN 0 THEN 'number'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'string'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'string'
      WHEN 12 THEN 'number'
      WHEN 13 THEN 'number'
      WHEN 14 THEN 'number'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'number'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'string'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'string'
      WHEN 21 THEN 'string'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN NULL
      WHEN 1 THEN base."code"
      WHEN 2 THEN base."site_number"
      WHEN 3 THEN base."fac_type"
      WHEN 4 THEN base."fac_use"
      WHEN 5 THEN base."faa_region"
      WHEN 6 THEN base."faa_dist"
      WHEN 7 THEN base."city"
      WHEN 8 THEN base."county"
      WHEN 9 THEN base."state"
      WHEN 10 THEN base."full_name"
      WHEN 11 THEN base."own_type"
      WHEN 15 THEN base."aero_cht"
      WHEN 17 THEN base."cbd_dir"
      WHEN 18 THEN base."act_date"
      WHEN 19 THEN base."cert"
      WHEN 20 THEN base."fed_agree"
      WHEN 21 THEN base."cust_intl"
      WHEN 22 THEN base."c_ldg_rts"
      WHEN 23 THEN base."joint_use"
      WHEN 24 THEN base."mil_rts"
      WHEN 25 THEN base."cntl_twr"
      WHEN 26 THEN base."major"
    END as "fieldValue",
   COUNT(*) as weight,
    CASE group_set
      WHEN 99999 THEN ''    WHEN 0 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 12 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
      WHEN 13 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 14 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 16 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
    END as "fieldRange"
  FROM '../data/airports.parquet' as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4
)
, __stage1 AS (
  SELECT
    "fieldName",
    "fieldType",
    COALESCE("fieldValue", "fieldRange") as "fieldValue",
    weight
  FROM __stage0
)
, __stage2 AS (
  SELECT
    group_set,
    base."fieldName" as "fieldName__0",
    CASE WHEN group_set=1 THEN
      base."fieldValue"
      END as "fieldValue__1",
    CASE WHEN group_set=1 THEN
      base."weight"
      END as "weight__1"
  FROM __stage1 as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4
)
SELECT
  "fieldName__0" as "fieldName",
  COALESCE(LIST({
    "fieldValue": "fieldValue__1", 
    "weight": "weight__1"}  ORDER BY  "weight__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:10],[]) as "values"
FROM __stage2
GROUP BY 1
ORDER BY 1 ASC NULLS LAST

Sampling

With large datasets, you can also sample a small subsection using the sample: parameter. Sampled indexes are great at identifing the important low cardinality fields.

document
run: duckdb.table('../data/airports.parquet') -> {
  index: *
  sample: 5000  // sample only 5000 rows
} -> {
  group_by: fieldName
  nest: values is  {
    group_by: fieldValue, weight
    order_by: weight desc
    limit: 10
  }
  order_by: fieldName
}
QUERY RESULTS
fieldNamevalues
act_date
fieldValueweight
3,056
04/199335
02/198735
03/199334
03/198833
06/198332
02/198232
11/198631
02/199829
08/199429
aero_cht
fieldValueweight
NEW YORK414
CHICAGO382
DETROIT343
DALLAS-FT WORTH242
WASHINGTON221
ST LOUIS199
SEATTLE192
HOUSTON191
ATLANTA185
TWIN CITIES183
c_ldg_rts
fieldValueweight
3,842
N1,081
Y77
cbd_dir
fieldValueweight
N959
SW602
NW598
SE588
S566
NE564
E545
W519
37
ENE6
cbd_dist
fieldValueweight
0 to 635,000
cert
fieldValueweight
4,845
CS 05/197323
AS 05/197318
BS 05/197317
LU 08/199012
DS 05/19736
LU 05/19736
AU 05/19735
LU 11/19744
LU 08/19764
city
fieldValueweight
HOUSTON30
SPRINGFIELD13
PORTLAND11
PHOENIX11
FORT WORTH11
COLUMBIA10
FARMINGTON10
LOS ANGELES10
GREENVILLE10
LEXINGTON10
cntl_twr
fieldValueweight
N4,837
Y163
code
fieldValueweight
11GA1
7OK31
WAY1
4IA01
AMW1
43NC1
7NY11
IS861
8FL51
LS071
county
fieldValueweight
JEFFERSON55
WASHINGTON52
LOS ANGELES41
MARION40
JACKSON38
MONTGOMERY33
LAKE33
HARRIS33
FRANKLIN32
MARICOPA29
cust_intl
fieldValueweight
3,841
N1,137
Y22
elevation
fieldValueweight
-113 to 99435,000
faa_dist
fieldValueweight
NONE1,768
CHI376
DET324
SEA292
MSP279
ATL261
NYC260
HAR220
ORL216
DCA165
faa_region
fieldValueweight
AGL1,141
ASW845
ASO728
AEA674
ANM529
ACE380
AWP345
ANE195
AAL157
6
fac_type
fieldValueweight
AIRPORT3,507
HELIPORT1,303
SEAPLANE BASE124
ULTRALIGHT32
STOLPORT28
GLIDERPORT6
fac_use
fieldValueweight
PR3,679
PU1,321
fed_agree
fieldValueweight
4,132
NGY404
N122
NGY354
NGPY47
NY138
125
NGPY324
N124
NGSY20
full_name
fieldValueweight
JOHNSON6
COMMUNITY MEMORIAL HOSPITAL5
CLARK4
FAIRVIEW4
ANDERSON4
MEMORIAL HOSPITAL4
SKYVIEW4
ST FRANCIS HOSPITAL4
EVANS3
VA MEDICAL CENTER3
id
fieldValueweight
100 to 197915,000
joint_use
fieldValueweight
3,762
N1,197
Y41
latitude
fieldValueweight
-14.21 to 70.415,000
longitude
fieldValueweight
-100.0 to 173.185,000
major
fieldValueweight
N4,933
Y67
mil_rts
fieldValueweight
3,743
Y701
N556
own_type
fieldValueweight
PR3,656
PU1,255
MR34
MA29
MN26
site_number
fieldValueweight
13646.02*H1
19173.9*H1
19051.1*A1
12402.5*A1
07735.01*H1
12401.6*A1
14525.*A1
26363.01*A1
18652.5*H1
03326.01*H1
state
fieldValueweight
TX488
CA219
IL215
PA209
FL204
OH194
IN163
AK157
NY155
MO146
fieldValueweight
5,000
[
  {
    "fieldName": "act_date",
    "values": [
      {
        "fieldValue": null,
        "weight": 3056
      },
      {
        "fieldValue": "04/1993",
        "weight": 35
      },
      {
        "fieldValue": "02/1987",
        "weight": 35
      },
      {
        "fieldValue": "03/1993",
        "weight": 34
      },
      {
        "fieldValue": "03/1988",
        "weight": 33
      },
      {
        "fieldValue": "06/1983",
        "weight": 32
      },
      {
        "fieldValue": "02/1982",
        "weight": 32
      },
      {
        "fieldValue": "11/1986",
        "weight": 31
      },
      {
        "fieldValue": "02/1998",
        "weight": 29
      },
      {
        "fieldValue": "08/1994",
        "weight": 29
      }
    ]
  },
  {
    "fieldName": "aero_cht",
    "values": [
      {
        "fieldValue": "NEW YORK",
        "weight": 414
      },
      {
        "fieldValue": "CHICAGO",
        "weight": 382
      },
      {
        "fieldValue": "DETROIT",
        "weight": 343
      },
      {
        "fieldValue": "DALLAS-FT WORTH",
        "weight": 242
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 221
      },
      {
        "fieldValue": "ST LOUIS",
        "weight": 199
      },
      {
        "fieldValue": "SEATTLE",
        "weight": 192
      },
      {
        "fieldValue": "HOUSTON",
        "weight": 191
      },
      {
        "fieldValue": "ATLANTA",
        "weight": 185
      },
      {
        "fieldValue": "TWIN CITIES",
        "weight": 183
      }
    ]
  },
  {
    "fieldName": "c_ldg_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 3842
      },
      {
        "fieldValue": "N",
        "weight": 1081
      },
      {
        "fieldValue": "Y",
        "weight": 77
      }
    ]
  },
  {
    "fieldName": "cbd_dir",
    "values": [
      {
        "fieldValue": "N",
        "weight": 959
      },
      {
        "fieldValue": "SW",
        "weight": 602
      },
      {
        "fieldValue": "NW",
        "weight": 598
      },
      {
        "fieldValue": "SE",
        "weight": 588
      },
      {
        "fieldValue": "S",
        "weight": 566
      },
      {
        "fieldValue": "NE",
        "weight": 564
      },
      {
        "fieldValue": "E",
        "weight": 545
      },
      {
        "fieldValue": "W",
        "weight": 519
      },
      {
        "fieldValue": null,
        "weight": 37
      },
      {
        "fieldValue": "ENE",
        "weight": 6
      }
    ]
  },
  {
    "fieldName": "cbd_dist",
    "values": [
      {
        "fieldValue": "0 to 63",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "cert",
    "values": [
      {
        "fieldValue": null,
        "weight": 4845
      },
      {
        "fieldValue": "CS 05/1973",
        "weight": 23
      },
      {
        "fieldValue": "AS 05/1973",
        "weight": 18
      },
      {
        "fieldValue": "BS 05/1973",
        "weight": 17
      },
      {
        "fieldValue": "LU 08/1990",
        "weight": 12
      },
      {
        "fieldValue": "DS 05/1973",
        "weight": 6
      },
      {
        "fieldValue": "LU 05/1973",
        "weight": 6
      },
      {
        "fieldValue": "AU 05/1973",
        "weight": 5
      },
      {
        "fieldValue": "LU 11/1974",
        "weight": 4
      },
      {
        "fieldValue": "LU 08/1976",
        "weight": 4
      }
    ]
  },
  {
    "fieldName": "city",
    "values": [
      {
        "fieldValue": "HOUSTON",
        "weight": 30
      },
      {
        "fieldValue": "SPRINGFIELD",
        "weight": 13
      },
      {
        "fieldValue": "PORTLAND",
        "weight": 11
      },
      {
        "fieldValue": "PHOENIX",
        "weight": 11
      },
      {
        "fieldValue": "FORT WORTH",
        "weight": 11
      },
      {
        "fieldValue": "COLUMBIA",
        "weight": 10
      },
      {
        "fieldValue": "FARMINGTON",
        "weight": 10
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 10
      },
      {
        "fieldValue": "GREENVILLE",
        "weight": 10
      },
      {
        "fieldValue": "LEXINGTON",
        "weight": 10
      }
    ]
  },
  {
    "fieldName": "cntl_twr",
    "values": [
      {
        "fieldValue": "N",
        "weight": 4837
      },
      {
        "fieldValue": "Y",
        "weight": 163
      }
    ]
  },
  {
    "fieldName": "code",
    "values": [
      {
        "fieldValue": "11GA",
        "weight": 1
      },
      {
        "fieldValue": "7OK3",
        "weight": 1
      },
      {
        "fieldValue": "WAY",
        "weight": 1
      },
      {
        "fieldValue": "4IA0",
        "weight": 1
      },
      {
        "fieldValue": "AMW",
        "weight": 1
      },
      {
        "fieldValue": "43NC",
        "weight": 1
      },
      {
        "fieldValue": "7NY1",
        "weight": 1
      },
      {
        "fieldValue": "IS86",
        "weight": 1
      },
      {
        "fieldValue": "8FL5",
        "weight": 1
      },
      {
        "fieldValue": "LS07",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "county",
    "values": [
      {
        "fieldValue": "JEFFERSON",
        "weight": 55
      },
      {
        "fieldValue": "WASHINGTON",
        "weight": 52
      },
      {
        "fieldValue": "LOS ANGELES",
        "weight": 41
      },
      {
        "fieldValue": "MARION",
        "weight": 40
      },
      {
        "fieldValue": "JACKSON",
        "weight": 38
      },
      {
        "fieldValue": "MONTGOMERY",
        "weight": 33
      },
      {
        "fieldValue": "LAKE",
        "weight": 33
      },
      {
        "fieldValue": "HARRIS",
        "weight": 33
      },
      {
        "fieldValue": "FRANKLIN",
        "weight": 32
      },
      {
        "fieldValue": "MARICOPA",
        "weight": 29
      }
    ]
  },
  {
    "fieldName": "cust_intl",
    "values": [
      {
        "fieldValue": null,
        "weight": 3841
      },
      {
        "fieldValue": "N",
        "weight": 1137
      },
      {
        "fieldValue": "Y",
        "weight": 22
      }
    ]
  },
  {
    "fieldName": "elevation",
    "values": [
      {
        "fieldValue": "-113 to 9943",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "faa_dist",
    "values": [
      {
        "fieldValue": "NONE",
        "weight": 1768
      },
      {
        "fieldValue": "CHI",
        "weight": 376
      },
      {
        "fieldValue": "DET",
        "weight": 324
      },
      {
        "fieldValue": "SEA",
        "weight": 292
      },
      {
        "fieldValue": "MSP",
        "weight": 279
      },
      {
        "fieldValue": "ATL",
        "weight": 261
      },
      {
        "fieldValue": "NYC",
        "weight": 260
      },
      {
        "fieldValue": "HAR",
        "weight": 220
      },
      {
        "fieldValue": "ORL",
        "weight": 216
      },
      {
        "fieldValue": "DCA",
        "weight": 165
      }
    ]
  },
  {
    "fieldName": "faa_region",
    "values": [
      {
        "fieldValue": "AGL",
        "weight": 1141
      },
      {
        "fieldValue": "ASW",
        "weight": 845
      },
      {
        "fieldValue": "ASO",
        "weight": 728
      },
      {
        "fieldValue": "AEA",
        "weight": 674
      },
      {
        "fieldValue": "ANM",
        "weight": 529
      },
      {
        "fieldValue": "ACE",
        "weight": 380
      },
      {
        "fieldValue": "AWP",
        "weight": 345
      },
      {
        "fieldValue": "ANE",
        "weight": 195
      },
      {
        "fieldValue": "AAL",
        "weight": 157
      },
      {
        "fieldValue": null,
        "weight": 6
      }
    ]
  },
  {
    "fieldName": "fac_type",
    "values": [
      {
        "fieldValue": "AIRPORT",
        "weight": 3507
      },
      {
        "fieldValue": "HELIPORT",
        "weight": 1303
      },
      {
        "fieldValue": "SEAPLANE BASE",
        "weight": 124
      },
      {
        "fieldValue": "ULTRALIGHT",
        "weight": 32
      },
      {
        "fieldValue": "STOLPORT",
        "weight": 28
      },
      {
        "fieldValue": "GLIDERPORT",
        "weight": 6
      }
    ]
  },
  {
    "fieldName": "fac_use",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 3679
      },
      {
        "fieldValue": "PU",
        "weight": 1321
      }
    ]
  },
  {
    "fieldName": "fed_agree",
    "values": [
      {
        "fieldValue": null,
        "weight": 4132
      },
      {
        "fieldValue": "NGY",
        "weight": 404
      },
      {
        "fieldValue": "N",
        "weight": 122
      },
      {
        "fieldValue": "NGY3",
        "weight": 54
      },
      {
        "fieldValue": "NGPY",
        "weight": 47
      },
      {
        "fieldValue": "NY1",
        "weight": 38
      },
      {
        "fieldValue": "1",
        "weight": 25
      },
      {
        "fieldValue": "NGPY3",
        "weight": 24
      },
      {
        "fieldValue": "N1",
        "weight": 24
      },
      {
        "fieldValue": "NGSY",
        "weight": 20
      }
    ]
  },
  {
    "fieldName": "full_name",
    "values": [
      {
        "fieldValue": "JOHNSON",
        "weight": 6
      },
      {
        "fieldValue": "COMMUNITY MEMORIAL HOSPITAL",
        "weight": 5
      },
      {
        "fieldValue": "CLARK",
        "weight": 4
      },
      {
        "fieldValue": "FAIRVIEW",
        "weight": 4
      },
      {
        "fieldValue": "ANDERSON",
        "weight": 4
      },
      {
        "fieldValue": "MEMORIAL HOSPITAL",
        "weight": 4
      },
      {
        "fieldValue": "SKYVIEW",
        "weight": 4
      },
      {
        "fieldValue": "ST FRANCIS HOSPITAL",
        "weight": 4
      },
      {
        "fieldValue": "EVANS",
        "weight": 3
      },
      {
        "fieldValue": "VA MEDICAL CENTER",
        "weight": 3
      }
    ]
  },
  {
    "fieldName": "id",
    "values": [
      {
        "fieldValue": "100 to 19791",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "joint_use",
    "values": [
      {
        "fieldValue": null,
        "weight": 3762
      },
      {
        "fieldValue": "N",
        "weight": 1197
      },
      {
        "fieldValue": "Y",
        "weight": 41
      }
    ]
  },
  {
    "fieldName": "latitude",
    "values": [
      {
        "fieldValue": "-14.21 to 70.41",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "longitude",
    "values": [
      {
        "fieldValue": "-100.0 to 173.18",
        "weight": 5000
      }
    ]
  },
  {
    "fieldName": "major",
    "values": [
      {
        "fieldValue": "N",
        "weight": 4933
      },
      {
        "fieldValue": "Y",
        "weight": 67
      }
    ]
  },
  {
    "fieldName": "mil_rts",
    "values": [
      {
        "fieldValue": null,
        "weight": 3743
      },
      {
        "fieldValue": "Y",
        "weight": 701
      },
      {
        "fieldValue": "N",
        "weight": 556
      }
    ]
  },
  {
    "fieldName": "own_type",
    "values": [
      {
        "fieldValue": "PR",
        "weight": 3656
      },
      {
        "fieldValue": "PU",
        "weight": 1255
      },
      {
        "fieldValue": "MR",
        "weight": 34
      },
      {
        "fieldValue": "MA",
        "weight": 29
      },
      {
        "fieldValue": "MN",
        "weight": 26
      }
    ]
  },
  {
    "fieldName": "site_number",
    "values": [
      {
        "fieldValue": "13646.02*H",
        "weight": 1
      },
      {
        "fieldValue": "19173.9*H",
        "weight": 1
      },
      {
        "fieldValue": "19051.1*A",
        "weight": 1
      },
      {
        "fieldValue": "12402.5*A",
        "weight": 1
      },
      {
        "fieldValue": "07735.01*H",
        "weight": 1
      },
      {
        "fieldValue": "12401.6*A",
        "weight": 1
      },
      {
        "fieldValue": "14525.*A",
        "weight": 1
      },
      {
        "fieldValue": "26363.01*A",
        "weight": 1
      },
      {
        "fieldValue": "18652.5*H",
        "weight": 1
      },
      {
        "fieldValue": "03326.01*H",
        "weight": 1
      }
    ]
  },
  {
    "fieldName": "state",
    "values": [
      {
        "fieldValue": "TX",
        "weight": 488
      },
      {
        "fieldValue": "CA",
        "weight": 219
      },
      {
        "fieldValue": "IL",
        "weight": 215
      },
      {
        "fieldValue": "PA",
        "weight": 209
      },
      {
        "fieldValue": "FL",
        "weight": 204
      },
      {
        "fieldValue": "OH",
        "weight": 194
      },
      {
        "fieldValue": "IN",
        "weight": 163
      },
      {
        "fieldValue": "AK",
        "weight": 157
      },
      {
        "fieldValue": "NY",
        "weight": 155
      },
      {
        "fieldValue": "MO",
        "weight": 146
      }
    ]
  },
  {
    "fieldName": null,
    "values": [
      {
        "fieldValue": null,
        "weight": 5000
      }
    ]
  }
]
WITH __stage0 AS (
  SELECT * from (SELECT * FROM '../data/airports.parquet' USING SAMPLE 5000) as x limit 100000 )
, __stage1 AS (
  SELECT
    group_set,
    CASE group_set
      WHEN 0 THEN 'id'
      WHEN 1 THEN 'code'
      WHEN 2 THEN 'site_number'
      WHEN 3 THEN 'fac_type'
      WHEN 4 THEN 'fac_use'
      WHEN 5 THEN 'faa_region'
      WHEN 6 THEN 'faa_dist'
      WHEN 7 THEN 'city'
      WHEN 8 THEN 'county'
      WHEN 9 THEN 'state'
      WHEN 10 THEN 'full_name'
      WHEN 11 THEN 'own_type'
      WHEN 12 THEN 'longitude'
      WHEN 13 THEN 'latitude'
      WHEN 14 THEN 'elevation'
      WHEN 15 THEN 'aero_cht'
      WHEN 16 THEN 'cbd_dist'
      WHEN 17 THEN 'cbd_dir'
      WHEN 18 THEN 'act_date'
      WHEN 19 THEN 'cert'
      WHEN 20 THEN 'fed_agree'
      WHEN 21 THEN 'cust_intl'
      WHEN 22 THEN 'c_ldg_rts'
      WHEN 23 THEN 'joint_use'
      WHEN 24 THEN 'mil_rts'
      WHEN 25 THEN 'cntl_twr'
      WHEN 26 THEN 'major'
    END as "fieldName",  CASE group_set
      WHEN 0 THEN 'number'
      WHEN 1 THEN 'string'
      WHEN 2 THEN 'string'
      WHEN 3 THEN 'string'
      WHEN 4 THEN 'string'
      WHEN 5 THEN 'string'
      WHEN 6 THEN 'string'
      WHEN 7 THEN 'string'
      WHEN 8 THEN 'string'
      WHEN 9 THEN 'string'
      WHEN 10 THEN 'string'
      WHEN 11 THEN 'string'
      WHEN 12 THEN 'number'
      WHEN 13 THEN 'number'
      WHEN 14 THEN 'number'
      WHEN 15 THEN 'string'
      WHEN 16 THEN 'number'
      WHEN 17 THEN 'string'
      WHEN 18 THEN 'string'
      WHEN 19 THEN 'string'
      WHEN 20 THEN 'string'
      WHEN 21 THEN 'string'
      WHEN 22 THEN 'string'
      WHEN 23 THEN 'string'
      WHEN 24 THEN 'string'
      WHEN 25 THEN 'string'
      WHEN 26 THEN 'string'
    END as "fieldType",  CASE group_set WHEN 99999 THEN NULL
      WHEN 1 THEN base."code"
      WHEN 2 THEN base."site_number"
      WHEN 3 THEN base."fac_type"
      WHEN 4 THEN base."fac_use"
      WHEN 5 THEN base."faa_region"
      WHEN 6 THEN base."faa_dist"
      WHEN 7 THEN base."city"
      WHEN 8 THEN base."county"
      WHEN 9 THEN base."state"
      WHEN 10 THEN base."full_name"
      WHEN 11 THEN base."own_type"
      WHEN 15 THEN base."aero_cht"
      WHEN 17 THEN base."cbd_dir"
      WHEN 18 THEN base."act_date"
      WHEN 19 THEN base."cert"
      WHEN 20 THEN base."fed_agree"
      WHEN 21 THEN base."cust_intl"
      WHEN 22 THEN base."c_ldg_rts"
      WHEN 23 THEN base."joint_use"
      WHEN 24 THEN base."mil_rts"
      WHEN 25 THEN base."cntl_twr"
      WHEN 26 THEN base."major"
    END as "fieldValue",
   COUNT(*) as weight,
    CASE group_set
      WHEN 99999 THEN ''    WHEN 0 THEN MIN(CAST(base."id" as VARCHAR)) || ' to ' || CAST(MAX(base."id") as VARCHAR)
      WHEN 12 THEN MIN(CAST(base."longitude" as VARCHAR)) || ' to ' || CAST(MAX(base."longitude") as VARCHAR)
      WHEN 13 THEN MIN(CAST(base."latitude" as VARCHAR)) || ' to ' || CAST(MAX(base."latitude") as VARCHAR)
      WHEN 14 THEN MIN(CAST(base."elevation" as VARCHAR)) || ' to ' || CAST(MAX(base."elevation") as VARCHAR)
      WHEN 16 THEN MIN(CAST(base."cbd_dist" as VARCHAR)) || ' to ' || CAST(MAX(base."cbd_dist") as VARCHAR)
    END as "fieldRange"
  FROM __stage0 as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,27,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4
)
, __stage2 AS (
  SELECT
    "fieldName",
    "fieldType",
    COALESCE("fieldValue", "fieldRange") as "fieldValue",
    weight
  FROM __stage1
)
, __stage3 AS (
  SELECT
    group_set,
    base."fieldName" as "fieldName__0",
    CASE WHEN group_set=1 THEN
      base."fieldValue"
      END as "fieldValue__1",
    CASE WHEN group_set=1 THEN
      base."weight"
      END as "weight__1"
  FROM __stage2 as base
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,1,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4
)
SELECT
  "fieldName__0" as "fieldName",
  COALESCE(LIST({
    "fieldValue": "fieldValue__1", 
    "weight": "weight__1"}  ORDER BY  "weight__1" desc NULLS LAST) FILTER (WHERE group_set=1)[1:10],[]) as "values"
FROM __stage3
GROUP BY 1
ORDER BY 1 ASC NULLS LAST

A More Complex Example

The rest of this pages uses the model below. The data is an excerpt from the IMDB. The Malloy schema for this model is shown on the right. The core value is movies, but joined at the principals (the people that worked on the movie) and the people (the actual data about the individuals).

We use the measure total_ratings to determin a movie's popularity. An individual's popularity is determined by the some of all the ratings of the movies a person has worked on.

document
source: movies is duckdb.table('../data/titles.parquet') extend {
  join_many: principals is duckdb.table('../data/principals.parquet') extend {
    join_one: people is duckdb.table('../data/names.parquet') 
      on nconst = people.nconst
  } on tconst = principals.tconst
  measure: total_ratings is numVotes.sum()
}

Weights can be any measure

Often a row count will work nicely as a weight, but sometimes there is something better. In movies, for example the sum of the number of votes will not only find the interesting most interesting movies but will also find the most interesting people. For example the most interesting people in the dataset.

document
run: movies -> {
  group_by: principals.people.primaryName
  aggregate: total_ratings
} 
QUERY RESULTS
primaryNametotal_​ratings
Brad Pitt18,269,386
Stan Lee18,229,893
John Williams17,651,379
Leonardo DiCaprio17,032,662
Tom Hanks16,650,835
Christopher Nolan15,368,415
Steven Spielberg15,069,082
Robert De Niro14,629,861
Christian Bale13,730,681
Samuel L. Jackson13,423,134
Robert Downey Jr.12,685,331
George Lucas12,584,331
Matt Damon12,509,393
Quentin Tarantino12,348,394
Bruce Willis12,045,753
Jack Kirby11,901,244
Johnny Depp11,755,228
Morgan Freeman11,435,905
Tom Cruise11,290,573
Thomas Newman11,056,371
Hans Zimmer10,957,053
Scott Rudin10,793,692
Charles Roven10,778,794
Bob Kane10,685,405
Ian McKellen10,552,024
Chris Evans10,497,785
Harrison Ford10,449,365
Scarlett Johansson10,128,373
David S. Goyer10,022,517
Jonathan Nolan9,823,211
Martin Scorsese9,692,050
Peter Jackson9,612,306
Orlando Bloom9,482,690
James Newton Howard9,473,293
Stephen King9,454,020
Hugh Jackman9,409,702
Kevin Feige9,408,976
Ridley Scott9,334,292
Natalie Portman9,206,903
Liam Neeson9,185,450
David Heyman9,174,472
Mark Ruffalo9,171,115
Bradley Cooper8,980,668
Will Smith8,960,562
Al Pacino8,933,776
Emma Thomas8,773,124
Tim Bevan8,756,327
Roger Deakins8,737,852
Fran Walsh8,594,296
Michael Caine8,533,028
Lorne Orleans8,518,672
Lawrence Bender8,503,500
Keanu Reeves8,480,149
David Fincher8,427,910
Mark Wahlberg8,393,193
Joel Silver8,360,819
Robert Richardson8,351,268
Philippa Boyens8,336,233
Kathleen Kennedy8,278,785
Matthew McConaughey8,271,368
Ben Affleck8,268,036
James Horner8,253,828
Alan Silvestri8,072,953
Chris Hemsworth8,058,188
Carter Burwell7,997,536
Simon Kinberg7,846,138
Eric Fellner7,845,245
J.R.R. Tolkien7,798,667
Jake Gyllenhaal7,762,048
Jerry Bruckheimer7,721,877
Edward Norton7,713,816
Woody Harrelson7,711,063
Ewan McGregor7,655,458
George Clooney7,653,368
David Benioff7,637,309
Kevin Spacey7,603,094
Brian Grazer7,570,625
Anne Hathaway7,509,220
Arnon Milchan7,494,607
Jim Carrey7,421,247
Robert Zemeckis7,371,523
Danny Elfman7,340,555
Ryan Reynolds7,319,537
Jason Blum7,308,653
James Cameron7,306,015
Sally Menke7,239,769
Tom Hardy7,213,428
Clint Eastwood7,209,434
Russell Crowe7,166,573
Akiva Goldsman7,094,797
Nicolas Cage7,053,952
Michael Giacchino7,047,025
Gary Oldman7,036,991
Cate Blanchett7,017,425
Michael Bay7,016,200
J.J. Abrams6,991,513
Willem Dafoe6,924,815
Christopher Markus6,886,318
Stephen McFeely6,886,318
Neal H. Moritz6,870,329
Alexandre Desplat6,867,309
Jennifer Lawrence6,855,625
Rachel McAdams6,777,091
Lena Headey6,742,472
Elijah Wood6,734,358
Dwayne Johnson6,584,416
Jason Statham6,582,293
Steve Kloves6,559,963
Denzel Washington6,556,554
Sigourney Weaver6,515,096
Arnold Schwarzenegger6,482,925
Tim Burton6,467,281
Steve Carell6,436,581
Terry Rossio6,423,369
Ceán Chaffin6,415,865
Amy Adams6,401,556
Daniel Radcliffe6,393,663
J.K. Rowling6,392,020
Mel Gibson6,322,650
Anthony Hopkins6,315,913
Emma Watson6,304,351
6,300,267
Rick McCallum6,211,607
Jack Nicholson6,180,622
Ted Elliott6,136,354
Adam Sandler6,094,935
Ed Harris6,077,615
Chris Columbus6,076,896
Cameron Diaz6,042,333
Joseph Gordon-Levitt6,034,397
Emilia Clarke5,995,245
D.B. Weiss5,972,128
Joaquin Phoenix5,967,875
Daniel Craig5,967,704
Luc Besson5,961,182
Ryan Gosling5,956,376
Vin Diesel5,944,056
Andrew Stanton5,942,167
John Goodman5,940,986
Ralph Fiennes5,906,715
Peter Dinklage5,906,118
David Koepp5,894,926
Pete Docter5,862,141
Sylvester Stallone5,861,981
Angelina Jolie5,857,481
Ian Bryce5,848,379
Hugo Weaving5,841,890
Frank Darabont5,826,908
Ben Kingsley5,825,798
Viggo Mortensen5,823,719
Richard Francis-Bruce5,798,818
Colin Farrell5,788,835
Charlize Theron5,765,794
Joss Whedon5,757,097
Nikolaj Coster-Waldau5,749,486
Julianne Moore5,744,632
Jonah Hill5,718,674
Joel Coen5,700,458
Zack Snyder5,695,090
Ben Stiller5,685,709
Bryan Cranston5,685,238
Kate Winslet5,682,918
Ethan Coen5,663,547
Walter F. Parkes5,657,162
Howard Shore5,638,214
Jeremy Renner5,623,966
Marco Beltrami5,590,317
Bryan Singer5,590,237
Emma Stone5,583,422
Guillermo del Toro5,570,240
Owen Wilson5,556,539
Rupert Grint5,552,590
Jude Law5,538,884
Matthew Vaughn5,524,297
Alex Kurtzman5,522,344
Lana Wachowski5,518,169
Keira Knightley5,512,152
Lilly Wachowski5,444,217
Judd Apatow5,443,590
Chris Pratt5,439,043
Francis Ford Coppola5,436,313
Will Ferrell5,420,626
Seth Rogen5,419,570
Sandra Bullock5,405,221
Steve Ditko5,371,488
Frank Marshall5,363,210
John Logan5,353,165
Ken Watanabe5,352,344
Sam Raimi5,334,765
Carrie-Anne Moss5,305,827
Benedict Cumberbatch5,273,615
Robin Williams5,224,218
Jennifer Connelly5,222,568
Ron Howard5,215,903
John Travolta5,206,564
Laurence Fishburne5,187,514
Ethan Hawke5,186,955
Jamie Foxx5,180,837
Lawrence Kasdan5,173,484
Gerard Butler5,156,049
[
  {
    "primaryName": "Brad Pitt",
    "total_ratings": 18269386
  },
  {
    "primaryName": "Stan Lee",
    "total_ratings": 18229893
  },
  {
    "primaryName": "John Williams",
    "total_ratings": 17651379
  },
  {
    "primaryName": "Leonardo DiCaprio",
    "total_ratings": 17032662
  },
  {
    "primaryName": "Tom Hanks",
    "total_ratings": 16650835
  },
  {
    "primaryName": "Christopher Nolan",
    "total_ratings": 15368415
  },
  {
    "primaryName": "Steven Spielberg",
    "total_ratings": 15069082
  },
  {
    "primaryName": "Robert De Niro",
    "total_ratings": 14629861
  },
  {
    "primaryName": "Christian Bale",
    "total_ratings": 13730681
  },
  {
    "primaryName": "Samuel L. Jackson",
    "total_ratings": 13423134
  },
  {
    "primaryName": "Robert Downey Jr.",
    "total_ratings": 12685331
  },
  {
    "primaryName": "George Lucas",
    "total_ratings": 12584331
  },
  {
    "primaryName": "Matt Damon",
    "total_ratings": 12509393
  },
  {
    "primaryName": "Quentin Tarantino",
    "total_ratings": 12348394
  },
  {
    "primaryName": "Bruce Willis",
    "total_ratings": 12045753
  },
  {
    "primaryName": "Jack Kirby",
    "total_ratings": 11901244
  },
  {
    "primaryName": "Johnny Depp",
    "total_ratings": 11755228
  },
  {
    "primaryName": "Morgan Freeman",
    "total_ratings": 11435905
  },
  {
    "primaryName": "Tom Cruise",
    "total_ratings": 11290573
  },
  {
    "primaryName": "Thomas Newman",
    "total_ratings": 11056371
  },
  {
    "primaryName": "Hans Zimmer",
    "total_ratings": 10957053
  },
  {
    "primaryName": "Scott Rudin",
    "total_ratings": 10793692
  },
  {
    "primaryName": "Charles Roven",
    "total_ratings": 10778794
  },
  {
    "primaryName": "Bob Kane",
    "total_ratings": 10685405
  },
  {
    "primaryName": "Ian McKellen",
    "total_ratings": 10552024
  },
  {
    "primaryName": "Chris Evans",
    "total_ratings": 10497785
  },
  {
    "primaryName": "Harrison Ford",
    "total_ratings": 10449365
  },
  {
    "primaryName": "Scarlett Johansson",
    "total_ratings": 10128373
  },
  {
    "primaryName": "David S. Goyer",
    "total_ratings": 10022517
  },
  {
    "primaryName": "Jonathan Nolan",
    "total_ratings": 9823211
  },
  {
    "primaryName": "Martin Scorsese",
    "total_ratings": 9692050
  },
  {
    "primaryName": "Peter Jackson",
    "total_ratings": 9612306
  },
  {
    "primaryName": "Orlando Bloom",
    "total_ratings": 9482690
  },
  {
    "primaryName": "James Newton Howard",
    "total_ratings": 9473293
  },
  {
    "primaryName": "Stephen King",
    "total_ratings": 9454020
  },
  {
    "primaryName": "Hugh Jackman",
    "total_ratings": 9409702
  },
  {
    "primaryName": "Kevin Feige",
    "total_ratings": 9408976
  },
  {
    "primaryName": "Ridley Scott",
    "total_ratings": 9334292
  },
  {
    "primaryName": "Natalie Portman",
    "total_ratings": 9206903
  },
  {
    "primaryName": "Liam Neeson",
    "total_ratings": 9185450
  },
  {
    "primaryName": "David Heyman",
    "total_ratings": 9174472
  },
  {
    "primaryName": "Mark Ruffalo",
    "total_ratings": 9171115
  },
  {
    "primaryName": "Bradley Cooper",
    "total_ratings": 8980668
  },
  {
    "primaryName": "Will Smith",
    "total_ratings": 8960562
  },
  {
    "primaryName": "Al Pacino",
    "total_ratings": 8933776
  },
  {
    "primaryName": "Emma Thomas",
    "total_ratings": 8773124
  },
  {
    "primaryName": "Tim Bevan",
    "total_ratings": 8756327
  },
  {
    "primaryName": "Roger Deakins",
    "total_ratings": 8737852
  },
  {
    "primaryName": "Fran Walsh",
    "total_ratings": 8594296
  },
  {
    "primaryName": "Michael Caine",
    "total_ratings": 8533028
  },
  {
    "primaryName": "Lorne Orleans",
    "total_ratings": 8518672
  },
  {
    "primaryName": "Lawrence Bender",
    "total_ratings": 8503500
  },
  {
    "primaryName": "Keanu Reeves",
    "total_ratings": 8480149
  },
  {
    "primaryName": "David Fincher",
    "total_ratings": 8427910
  },
  {
    "primaryName": "Mark Wahlberg",
    "total_ratings": 8393193
  },
  {
    "primaryName": "Joel Silver",
    "total_ratings": 8360819
  },
  {
    "primaryName": "Robert Richardson",
    "total_ratings": 8351268
  },
  {
    "primaryName": "Philippa Boyens",
    "total_ratings": 8336233
  },
  {
    "primaryName": "Kathleen Kennedy",
    "total_ratings": 8278785
  },
  {
    "primaryName": "Matthew McConaughey",
    "total_ratings": 8271368
  },
  {
    "primaryName": "Ben Affleck",
    "total_ratings": 8268036
  },
  {
    "primaryName": "James Horner",
    "total_ratings": 8253828
  },
  {
    "primaryName": "Alan Silvestri",
    "total_ratings": 8072953
  },
  {
    "primaryName": "Chris Hemsworth",
<