Malloy Documentation
search

All expressions in Malloy have an "evaluation space" which can be one of four values: literal, constant, input, and output.

Functions may have constraints on what evaluation space particular arguments can be. For example:

  • In avg_moving(expr, preceding), preceding must be a literal number.

  • In lag(expr), expr must be an output value.

  • In avg(expr), expr must be an input value.

  • In lag(expr, offset, default), default must be a constant value.

Literals

Literal expressions are any literal value that you can write in Malloy. These include literals of type number, boolean, string, date, timestamp, and even regular expressions. The following are all literals: 12, true, 'hello world', @2003, @2011-11-11 11:11:11, r'.*'.

Some functions require that particular arguments be literals. For example, avg_moving(expr, preceding) required that preceding be a literal number.

document
run: flights -> {
  group_by: carrier
  aggregate: flight_count
  // Second argument must be a literal
  calculate: rolling_avg is avg_moving(flight_count, 3)
}
QUERY RESULTS
carrierflight_​countrolling_​avg
WN88,75188,751
US37,68363,217
AA34,57753,670.333
NW33,58048,647.75
UA32,75734,649.25
[
  {
    "carrier": "WN",
    "flight_count": 88751,
    "rolling_avg": 88751
  },
  {
    "carrier": "US",
    "flight_count": 37683,
    "rolling_avg": 63217
  },
  {
    "carrier": "AA",
    "flight_count": 34577,
    "rolling_avg": 53670.333333333336
  },
  {
    "carrier": "NW",
    "flight_count": 33580,
    "rolling_avg": 48647.75
  },
  {
    "carrier": "UA",
    "flight_count": 32757,
    "rolling_avg": 34649.25
  }
]
SELECT 
   flights."carrier" as "carrier",
   COUNT( 1) as "flight_count",
   AVG((COUNT( 1))) OVER(  ORDER BY  COUNT( 1) desc NULLS LAST ROWS BETWEEN 3 PRECEDING AND 0 FOLLOWING) as "rolling_avg"
FROM '../data/flights.parquet' as flights
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Constants

Constant expressions are produced by operating on literals, e.g. 1 + 1 etc.

Some functions require that particular arguments be constants. For example, the default argument to lag must be constant:

document
run: flights -> {
  group_by: carrier
  aggregate: flight_count
  // Third argument must be a constant (or literal)
  calculate: prev_carrier1 is lag(carrier, 1, concat('NO', ' ', 'VALUE'))
  calculate: prev_carrier2 is lag(carrier, 1, 'NO VALUE')
}
QUERY RESULTS
carrierflight_​countprev_​carrier1prev_​carrier2
WN88,751NO VALUENO VALUE
US37,683WNWN
AA34,577USUS
NW33,580AAAA
UA32,757NWNW
[
  {
    "carrier": "WN",
    "flight_count": 88751,
    "prev_carrier1": "NO VALUE",
    "prev_carrier2": "NO VALUE"
  },
  {
    "carrier": "US",
    "flight_count": 37683,
    "prev_carrier1": "WN",
    "prev_carrier2": "WN"
  },
  {
    "carrier": "AA",
    "flight_count": 34577,
    "prev_carrier1": "US",
    "prev_carrier2": "US"
  },
  {
    "carrier": "NW",
    "flight_count": 33580,
    "prev_carrier1": "AA",
    "prev_carrier2": "AA"
  },
  {
    "carrier": "UA",
    "flight_count": 32757,
    "prev_carrier1": "NW",
    "prev_carrier2": "NW"
  }
]
SELECT 
   flights."carrier" as "carrier",
   COUNT( 1) as "flight_count",
   LAG((flights."carrier"), 1, CONCAT('NO',' ','VALUE')) OVER(  ORDER BY  COUNT( 1) desc NULLS LAST ) as "prev_carrier1",
   LAG((flights."carrier"), 1, 'NO VALUE') OVER(  ORDER BY  COUNT( 1) desc NULLS LAST ) as "prev_carrier2"
FROM '../data/flights.parquet' as flights
GROUP BY 1
ORDER BY 2 desc NULLS LAST

Currently, when constants are saved as dimensions, they become input fields, even though they are known to be constants. This may change in the future.

Inputs

Input expressions are those that reference columns in a table or dimensions defined in a source. They represent data that exists inside the source table or that can be computed directly from a particular row of data.

document
run: flights -> {
  group_by: carrier
}
QUERY RESULTS
carrier
AA
AS
B6
CO
DL
[
  {
    "carrier": "AA"
  },
  {
    "carrier": "AS"
  },
  {
    "carrier": "B6"
  },
  {
    "carrier": "CO"
  },
  {
    "carrier": "DL"
  }
]
SELECT 
   flights."carrier" as "carrier"
FROM '../data/flights.parquet' as flights
GROUP BY 1
ORDER BY 1 asc NULLS LAST

In the above query, carrier is an input expression (in particular, an input field).

Outputs

Output expressions are those which reference columns in the output of a query or aggregate values. They represent values which can be produced from the source table. Dimensions are only output values when they are included in the query. Aggregate values are always output values, because they don't need to be included in the result table to be known.

Dimensional Outputs

Any time a field is included in a query, it creates an output field with the same name:

document
run: flights -> {
  group_by: 
    carrier
    new_field is 1
  aggregate: flight_count
}
QUERY RESULTS
carriernew_​fieldflight_​count
WN188,751
US137,683
AA134,577
NW133,580
UA132,757
[
  {
    "carrier": "WN",
    "new_field": 1,
    "flight_count": 88751
  },
  {
    "carrier": "US",
    "new_field": 1,
    "flight_count": 37683
  },
  {
    "carrier": "AA",
    "new_field": 1,
    "flight_count": 34577
  },
  {
    "carrier": "NW",
    "new_field": 1,
    "flight_count": 33580
  },
  {
    "carrier": "UA",
    "new_field": 1,
    "flight_count": 32757
  }
]
SELECT 
   flights."carrier" as "carrier",
   1 as "new_field",
   COUNT( 1) as "flight_count"
FROM '../data/flights.parquet' as flights
GROUP BY 1,2
ORDER BY 3 desc NULLS LAST

The above query defines carrier, new_field, and flight_count as output fields of the query.

When inside a calculate: statement, field references will by default use the output field with that name. There is often an input field with the same name, and in that case the output field will take precedence. In the following query, group_by: carrier creates an output field called carrier. In lag(carrier), carrier refers to that output field (rather than the input field carrier defined in flights) because it appears inside a calculate: statement.

document
run: flights -> {
  group_by: carrier
  calculate: prev_carrier is lag(carrier)
}
QUERY RESULTS
carrierprev_​carrier
AA
ASAA
B6AS
COB6
DLCO
[
  {
    "carrier": "AA",
    "prev_carrier": null
  },
  {
    "carrier": "AS",
    "prev_carrier": "AA"
  },
  {
    "carrier": "B6",
    "prev_carrier": "AS"
  },
  {
    "carrier": "CO",
    "prev_carrier": "B6"
  },
  {
    "carrier": "DL",
    "prev_carrier": "CO"
  }
]
SELECT 
   flights."carrier" as "carrier",
   LAG((flights."carrier")) OVER(  ORDER BY  flights."carrier" asc NULLS LAST ) as "prev_carrier"
FROM '../data/flights.parquet' as flights
GROUP BY 1
ORDER BY 1 asc NULLS LAST

In this next query, group_by: output_carrier is carrier creates an output field called output_carrier, which is referenced in the subsequent line.

document
run: flights -> {
  group_by: output_carrier is carrier
  calculate: prev_carrier is lag(output_carrier)
}
QUERY RESULTS
output_​carrierprev_​carrier
AA
ASAA
B6AS
COB6
DLCO
[
  {
    "output_carrier": "AA",
    "prev_carrier": null
  },
  {
    "output_carrier": "AS",
    "prev_carrier": "AA"
  },
  {
    "output_carrier": "B6",
    "prev_carrier": "AS"
  },
  {
    "output_carrier": "CO",
    "prev_carrier": "B6"
  },
  {
    "output_carrier": "DL",
    "prev_carrier": "CO"
  }
]
SELECT 
   flights."carrier" as "output_carrier",
   LAG((flights."carrier")) OVER(  ORDER BY  flights."carrier" asc NULLS LAST ) as "prev_carrier"
FROM '../data/flights.parquet' as flights
GROUP BY 1
ORDER BY 1 asc NULLS LAST

Here, calculate: prev_carrier is lag(carrier) would result in an error, because carrier is an input expression, and lag requires that its first argument be an output expression.

Aggregate Outputs

An aggregate value is always considered to be an output expression, even when it is a direct reference to a measure.

document
run: flights -> {
  group_by: carrier
  calculate: prev_carrier_flight_count is lag(flight_count)
}
QUERY RESULTS
carrierprev_​carrier_​flight_​count
AA
AS34,577
B68,453
CO4,842
DL7,139
[
  {
    "carrier": "AA",
    "prev_carrier_flight_count": null
  },
  {
    "carrier": "AS",
    "prev_carrier_flight_count": 34577
  },
  {
    "carrier": "B6",
    "prev_carrier_flight_count": 8453
  },
  {
    "carrier": "CO",
    "prev_carrier_flight_count": 4842
  },
  {
    "carrier": "DL",
    "prev_carrier_flight_count": 7139
  }
]
SELECT 
   flights."carrier" as "carrier",
   LAG((COUNT( 1))) OVER(  ORDER BY  flights."carrier" asc NULLS LAST ) as "prev_carrier_flight_count"
FROM '../data/flights.parquet' as flights
GROUP BY 1
ORDER BY 1 asc NULLS LAST