porting kele/hand to Rust

2024-01-22

while reading issue #3 of Paged Out! i came across a fun-looking library whose code fit entirely on the single page dedicated to it. called hand, it lets you help out your program by manually processing edge-cases that would be annoying or too involved to handle in code.

often, when processing large amounts of data, most of it will follow a certain format, save for a few edge-cases which use very disparate formats. writing code to deal with all of these can be very time-consuming, while you, a human, can take a single look at each edge-case and fix it. with hand, you wrap your processing function with a helper function that prompts you (through stdout/stdin) to enter the correct value any time the processing function fails. your program can then continue execution using the value you provided.

i thought this was a very interesting concept, and since i do a fair bit of data processing, i wanted to have a Rust version of this library i could use. with generics, the Result type, and things such as serde, Rust is a very good fit.

the outcome is helping-hand! i tried to keep the API very similar to the original, providing a function called help_with that wraps around your function. for example, given the following processing function:

use std::num::ParseIntError;

fn parse_num(s: &str) -> Result<u8, ParseIntError> {
    s.parse()
}

your code changes from:

for num in ["1", "2", "three", "4"] {
    let result = parse_num(num).unwrap();
    println!("{result}");
}

to:

for num in ["1", "2", "three", "4"] {
    let result = help_with(parse_num)(num).unwrap();
    println!("{result}");
}

running the above program will output the following text, then stop, waiting for input:

1
2
Error with input "three": ParseIntError { kind: InvalidDigit }
Fix:

here, our parse_num function returned an error when given the input "three", so help_with pauses execution, tells us about the error, and asks us to fix the value. typing 3 and pressing Ctrl+D submits the value, and evaluation continues from the point where it stopped. this means that the entire output for the program ends up being:

1
2
Error with input "three": ParseIntError { kind: InvalidDigit }
Fix: 3
3
4
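
to make the mechanism a bit more concrete, here is a minimal sketch of how a wrapper like this can be written. it is not helping-hand's actual code: the name help_with_io is made up for illustration, it takes the reader and writer as parameters so it can be tried without a terminal, it reads a single line instead of waiting for Ctrl+D, and it parses the reply with FromStr rather than serde so that it has no dependencies.

```rust
use std::fmt::Debug;
use std::io::{self, BufRead, Write};
use std::str::FromStr;

/// Wraps `f` so that a failure prompts a human for a replacement value.
/// Hypothetical sketch: `help_with_io` is not part of helping-hand, and the
/// reply is parsed with `FromStr` (an assumption) instead of serde.
pub fn help_with_io<T, E, F, R, W>(
    f: F,
    mut input: R,
    mut output: W,
) -> impl FnMut(&str) -> io::Result<T>
where
    T: FromStr,
    E: Debug,
    F: Fn(&str) -> Result<T, E>,
    R: BufRead,
    W: Write,
{
    move |s: &str| {
        // Happy path: the processing function succeeded, nothing to ask.
        let err = match f(s) {
            Ok(value) => return Ok(value),
            Err(err) => err,
        };

        // Show the offending input and the error, then ask for a fix.
        writeln!(output, "Error with input {s:?}: {err:?}")?;
        loop {
            write!(output, "Fix: ")?;
            output.flush()?;

            let mut line = String::new();
            if input.read_line(&mut line)? == 0 {
                // The input ended before a usable value was typed.
                return Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "no fix provided",
                ));
            }

            // Keep asking until the reply parses as a `T`.
            if let Ok(value) = line.trim().parse::<T>() {
                return Ok(value);
            }
            writeln!(output, "could not parse that, try again")?;
        }
    }
}
```

in a real program you would pass something like `io::stdin().lock()` and `io::stdout()` as the last two arguments. the wrapper returns an FnMut closure because it has to mutate the reader and writer it captured.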

helping-hand uses a type's serde::Deserialize implementation (with serde_json), which means it still works with non-primitive types, by allowing you to write JSON when prompted. so, if your processing function were to instead look like:

use serde::Deserialize;

#[derive(Deserialize)]
struct Repeat {
    string: String,
    times: usize,
}

fn parse(s: &str) -> Result<Repeat, &'static str> {
    let vec = s.split(',').collect::<Vec<_>>();

    let [string, times] = vec[..] else {
        return Err("incorrect number of commas");
    };

    let string = string.to_string();
    let times = times.parse().map_err(|_| "error parsing number")?;

    Ok(Repeat { string, times })
}

you would still be able to input the value:

Error with input "there is no comma here!; 4": "incorrect number of commas"
Fix: {"string": "there is no comma here!", "times": 4}

JSON might not be your preferred data serialization format, but thankfully serde's flexibility would allow us to easily swap this for some other format. at the time of writing i haven't implemented anything to customize what format is used, but it would be really simple to add.
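
one possible shape for that customization, sketched under my own assumptions rather than taken from the crate: let the caller pass the decoding step as a closure. the name help_with_format and its signature are hypothetical. with serde, the decoder could be `|s| serde_json::from_str(s)` or the equivalent function from another format crate; the sketch itself only uses std, so a hexadecimal decoder stands in for "some other format" in the usage note.

```rust
use std::fmt::Debug;
use std::io::{self, Read, Write};

/// Hypothetical variant of the wrapper in which the caller chooses how the
/// typed fix is decoded. Not helping-hand's API; names are illustrative.
pub fn help_with_format<T, E, DE, F, D, R, W>(
    f: F,
    decode: D,
    mut input: R,
    mut output: W,
) -> impl FnMut(&str) -> io::Result<T>
where
    E: Debug,
    DE: Debug,
    F: Fn(&str) -> Result<T, E>,
    D: Fn(&str) -> Result<T, DE>,
    R: Read,
    W: Write,
{
    move |s: &str| {
        let err = match f(s) {
            Ok(value) => return Ok(value),
            Err(err) => err,
        };

        writeln!(output, "Error with input {s:?}: {err:?}")?;
        write!(output, "Fix: ")?;
        output.flush()?;

        // Read until end of input so that multi-line fixes are possible.
        let mut fix = String::new();
        input.read_to_string(&mut fix)?;

        // Hand the raw text to whichever decoder the caller picked.
        decode(fix.trim())
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, format!("{e:?}")))
    }
}
```

for example, passing `|s: &str| u8::from_str_radix(s, 16)` as the decoder means that typing 1f at the prompt yields 31. unlike a retry loop, this version gives up with an InvalidData error if the fix itself fails to decode.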

that's the basics of helping-hand! my main use case is processing OSM (and other geography-related) data, which is almost always in a standard format, but sometimes has some hard-to-deal-with edge-cases.

an aside: on repls and notebooks

this philosophy of working in concert with your program is what repls and notebooks (a la jupyter) provide, and it is something i sorely miss in compiled languages such as Rust. humans can do things no machine can do, so having one as a cog in your machine can bring big advantages.

after some research, i have found that Rust does have some options for REPLs, such as evcxr or IRust. i haven't had the time to test any of them yet, but they seem very promising. being able to play with your code in real time without having to rerun the program from scratch is very valuable, since it allows for quick prototyping and iteration. if they work as well as i hope they do, they're going to become a very useful tool in my arsenal.

conclusion

this crate provides the opposite of automation, which is a fun concept to play around with. sometimes you are in the top right quadrant of the automation quadrant, which means it's not really worth spending the time to automate something.

so, should you use helping-hand? probably not. it has a very limited use case, since in most cases it's easier to either deal with the edge-cases in code, or to manually edit the source data so that it conforms to the rest. but adding human-based recoverability to programs is just fun, and that by itself makes this crate worth existing.