Table of Contents
Most of the time, I'm using the SAT Verbal hemisphere, but I also like exercising the SAT Math side. Here's my first beyond Hello, World!
effort in the Julia
programming language. For most of the last 17 years, I've been using R
, which displaced Python
for most purposes. R
has a wealth of statistical treasures and an impeccable pedigree.
But it is a crappy procedural language and it's ducktyped. It is functional as it presents in the REPL and that is a great advantage to the procedural/imperative style of C
and its progeny. But unlike Haskell
, which is also a functional language, it's not strongly typed. That means that objects tend to get composed of linked lists and can be tedious to navigate. But Haskell
is kinda geeky. When I discovered Julia I was smote.
The problem that spurred me here is this: A warehouse in England ships pallets to various postcodes (the Brits are crazy, they have like 1.7 million ZIP+5 equivalents) and are charged by the pallet with discounts given for multiple pallets upto a standard lorryful of 26 pallets. How can they tell if they are being billed by the carrier correctly?
How, indeed, using spreadsheets? Thousands of shipments to scores of addresses in various load sizes. Wasn't gonna happen. So, here's what I came up with.
# audit.jl
# compare invoiced shipping costs to UK postcodes
# given number of pallets against the rate card
# provided by carrier; returns 0 if no differences
# otherwise use `truth_table` vector to subsect
# DataFrame `obj`
# author: Richard Careaga
# Date: 2024-02-28
#------------------------------------------------------------------
# libraries
using CSV
using DataFrames
using Dates
#------------------------------------------------------------------
# constant
objs = "/Users/me/postal/objs/"
#------------------------------------------------------------------
# data
# single origin shipments to separate destinations
df = CSV.read(objs * "/PE330JF.csv",DataFrame)
# =
5_000 records
```markdown
| variable | eltype |
|-------------|-----------|
| Symbol | DataType |
| dt | DateTime |
| from | String15 |
| to | String15 |
| pallets | Int64 |
| cost | Float64 |
| key | String7 |
| origin | String7 |
| destination | String15 |
| meters | Float64 |
| miles | Float64 |
```
=#
# corresponding charges per pallet
card = CSV.read(objs * "rate_card.csv", DataFrame, header = false)
#=
100 records
```markdown
| variable | eltype |
|----------|---------|
| Symbol | DataType|
| Column1 | String3 |
| Column2 | Float64 |
| Column3 | Float64 |
⋮ ⋮
| Column27 | Float64 |
```
=#
#------------------------------------------------------------------
# preprocessing
df.dt = DateTime.(df.dt, "m/d/yy H:M") + Year(2000)
# Create a new column to hold the first two letters of the 'key' column
# the keep only the columns for key, number of pallets and cost
df[!, :key_group] = first.(df.key, 2)
grp = unique(df[:,[:key_group,:pallets,:cost]])
rename!(grp,[:key,:pallets,:cost])
sort!(grp, :key)
# discard unused rows from rate card
ours = unique(df.key_group)
theirs = card.Column1
common = intersect(ours,theirs)
underlap = setdiff(ours,theirs)
short = filter(row -> in(row.Column1, common), card)
new_names = Symbol[:key; [Symbol("p", i) for i in 1:26]]
short = DataFrame(short[!, 1:27], new_names)
# principal object
obj = leftjoin(grp,short, on = :key)
obj = obj[completecases(obj),:]
# Generate column names to compare against 'cost'
# prefix the value of the pallets column with the letter p
# to correspond to p1 … p26, the cost for a given number of pallets
obj.p_column = "p" .* string.(obj.pallets)
#------------------------------------------------------------------
# main
# Perform comparison and create truth table
truth_table = [obj[i, :cost] != obj[i, Symbol(obj[i, :p_column])] for i in 1:size(obj, 1)]
# Result will equal 0 if all costs correspond to the rate card for that
# number of pallets
sum(truth_table)
I cheated. I had the help of Claude, an AI bot who answered my simple questions about syntax and greatly assisted in the real work, which is properly framing the question systematically. When you go back to school algebra and start thinking $y = f(x)$ pieces start falling into place easily.
$x$ is what is at hand, $y$ is what is desired and $f$ is the operator or script that will get you there. Any of these may be, and in fact usually are, composite. So you start doing things like $y = g(f(x)$, etc. Approaching it that way is like having a conversation:
$y$, darling, tell me what you need to be happy
$x$, sweetheart, do you think I could give nicknames to your beautiful eyes?
⋮
so, $f$ if I make this little change can you get me closer?
Far too much formal education is wasted trying to find the answer, which is too bad. Answers change all the time, but the good questions are durable.