TODO
I’m afraid I need to rework my approach again. I figured out that the initial connection timeout is a very important way to avoid a lot of rate limits. Because of this, my approach of sending ranges in batches of one per log server isn’t holding up well: the loop waits out the full connection timeout before it processes a new batch. Instead I should:
- Create a proper RateLimit class, similar to the Rust SDK interface.
- Tweak rate limits and connection timeouts to find the optimal balance per server.
- Provide the scan log servers with a HashMap<LogRange, Vec<Ranges>>.
- Once a range for a particular log server is completed, pluck another one from the HashMap.
- If a range fails to be completed in the rate limit timeframe, move to the next range and leave the range as Pending in the database.
- See if we can store each log server's optimal rate limits and preferred range sizes in the database. We could base the latter on the average entry count that a log server returns, so servers that return 1024 entries per request can use larger ranges than servers that return a lower count, like 32. Something like LOGSERVER#<url> SK: PROPERTIES.
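To think the "pluck the next range" idea through, here is a rough sketch of the queue. It is not the actual implementation: `RangeQueue`, `RangeStatus`, `next_range`, and the other method names are hypothetical, ranges are assumed to be `Range<u64>` over entry indexes, log servers are keyed by URL, and the in-memory `status` map only stands in for the database.

```rust
use std::collections::{HashMap, VecDeque};
use std::ops::Range;

/// Status of a range; `Pending` ranges can be picked up again by a later run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RangeStatus {
    Pending,
    Completed,
}

/// Hypothetical work queue: ranges still to scan, keyed by log server URL.
#[derive(Default)]
struct RangeQueue {
    todo: HashMap<String, VecDeque<Range<u64>>>,
    /// Stand-in for the database: status per (log server URL, range start).
    status: HashMap<(String, u64), RangeStatus>,
}

impl RangeQueue {
    /// Queue a range for a log server and record it as `Pending`.
    fn push(&mut self, server: &str, range: Range<u64>) {
        self.status
            .insert((server.to_string(), range.start), RangeStatus::Pending);
        self.todo
            .entry(server.to_string())
            .or_default()
            .push_back(range);
    }

    /// Pluck the next range for this log server, if any are left.
    fn next_range(&mut self, server: &str) -> Option<Range<u64>> {
        self.todo.get_mut(server)?.pop_front()
    }

    /// Mark a range as done. A range that ran out of time is never marked,
    /// so it simply stays `Pending`.
    fn complete(&mut self, server: &str, range: &Range<u64>) {
        self.status
            .insert((server.to_string(), range.start), RangeStatus::Completed);
    }

    /// Look up the recorded status of a range.
    fn status_of(&self, server: &str, range: &Range<u64>) -> Option<RangeStatus> {
        self.status
            .get(&(server.to_string(), range.start))
            .copied()
    }
}
```

The point of the per-server queue is that a slow or rate-limited log server only holds up its own ranges, while the other servers keep pulling work as soon as they finish a range.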
use std::ops::Range;
use std::time::Duration;

struct RetryConfig {
    max_retries: usize,
    initial_backoff: Duration,
    max_backoff: Range<Duration>, // range to randomize
    exponential_backoff: bool,
    // With exponential backoff we increase this exponentially, otherwise linearly.
    // We use a range to randomize.
    step: Range<Duration>,
}

// Defaults from the notes; the unit (seconds) is an assumption.
impl Default for RetryConfig {
    fn default() -> Self {
        Self {
            max_retries: 3,
            initial_backoff: Duration::from_secs(3),
            max_backoff: Duration::from_secs(20)..Duration::from_secs(22),
            exponential_backoff: true,
            step: Duration::from_secs(1)..Duration::from_secs(2),
        }
    }
}

struct LogServer {
    retry_config: RetryConfig,
    average_entry_size: u16, // default: 1024
    url: String,
    mirror: Option<String>,
}

struct CertTransparencyClient {
    log_server: LogServer,
    retry_config: RetryConfig,
    // Maximum number of times a range can fail before we stop trying it (default: 3).
    max_range_failure: usize,
}

// Placeholder types so the signature below is complete; their shape is still undecided.
struct Entry;
enum ClientError {
    MaxRangeRetries,
    MaxLogServerRetries,
}

impl CertTransparencyClient {
    // `Range<u64>` as the range type is an assumption.
    async fn get_entries(&self, range: Range<u64>) -> Result<Vec<Entry>, ClientError> {
        // Retrieve entries.
        // If entries == 0, retry.
        // If entries != 0, reset the retry counter.
        // If MaxRangeRetries, return an error. The user of the cert client should skip this range and try another one.
        // If MaxLogServerRetries, return an error. The user of the cert client should abandon this log server completely.
        // If the range is completed, return all entries that were retrieved.
        //
        // Open question: do we want to handle the name extraction here already? Maybe the logic above should be a function get_names, and get_entries should only return RateLimited and connection errors. get_names would then parse those errors and either retry, raise MaxLogServerRetries or MaxRangeRetries, or return the entries.
        todo!()
    }
}
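To make the backoff fields concrete, here is one way the delay before a given retry could be computed. This is only a sketch under assumptions: `backoff_delay` and `pick` are hypothetical helpers, the struct is a trimmed copy of `RetryConfig`, durations are in seconds, "exponential" is read as doubling the step on each attempt, and the caller passes a random fraction `jitter` in `[0, 1)` (for example from the `rand` crate) so the function itself stays deterministic.

```rust
use std::ops::Range;
use std::time::Duration;

/// Trimmed copy of the retry settings, only the fields the delay needs.
struct RetryConfig {
    initial_backoff: Duration,
    max_backoff: Range<Duration>,
    exponential_backoff: bool,
    step: Range<Duration>,
}

/// Pick a point inside `range`; `jitter` is a fraction in [0, 1).
/// Assumes `range.end >= range.start`.
fn pick(range: &Range<Duration>, jitter: f64) -> Duration {
    range.start + (range.end - range.start).mul_f64(jitter)
}

/// Delay before retry number `attempt` (0-based), capped by `max_backoff`.
fn backoff_delay(cfg: &RetryConfig, attempt: u32, jitter: f64) -> Duration {
    let step = pick(&cfg.step, jitter);
    let growth = if cfg.exponential_backoff {
        // step * (2^attempt - 1): adds 0, 1, 3, 7, ... steps
        step.saturating_mul(2u32.saturating_pow(attempt).saturating_sub(1))
    } else {
        // step * attempt: adds 0, 1, 2, 3, ... steps
        step.saturating_mul(attempt)
    };
    let cap = pick(&cfg.max_backoff, jitter);
    cfg.initial_backoff.saturating_add(growth).min(cap)
}
```

Passing the jitter in, rather than generating it inside the function, keeps the calculation easy to test and leaves the choice of random source to the caller.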