- We want to help academia and industry improve algorithms with realistic, practical data; this will help drive appropriate benchmarks
- We want to identify the challenging applications to be able to drive the collection/reuse of specific datasets
- Identify applications
- Identify the datasets
- Identify current shortcomings: what is missing and what needs work (e.g. how to handle augmentation and negative cases)
- Can we grow datasets to a more realistic, production-quality level (rather than just a small toy example)? The intent here is to have at least one full, specific example that reflects real-world complexity, not necessarily something that someone could turn into an actual product.
- Can we break a large dataset down into easy/medium/hard subsets of the data, or subsets of the output classes?
- How compressible can we make the problem? (This crosses into benchmark space.)
- What are relevant benchmark topologies/criteria for the application?
- Share best practices so the community and industry can move forward (rather than being stuck with out-of-date benchmarks)
- We need to identify the right cadence for changing benchmarks (to support the above, without making it impossible to compare vendors to one another because the benchmarks keep changing). MLPerf reviews its benchmarks once per year.
- Identify standards for reuse between different NNs (e.g. standardize input resolution and output classes, so that different NNs can be plugged in and run on the same data)
- How do we deal with Neural Architecture Search (NAS) optimization possibilities? How do we compare what different users can do? (e.g. the open category in MLPerf)
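To make the "grow datasets toward production realism" and "negative cases" points above concrete, here is a minimal sketch in Python. It assumes samples are (signal, label) pairs of plain number lists; the additive-noise model, the copy count, and the `unknown` catch-all label are all illustrative assumptions, not working-group decisions.

```python
import random

NEGATIVE_LABEL = "unknown"  # assumed catch-all label for negative/background cases


def augment(samples, noise_scale=0.05, copies=2, seed=0):
    """Return the originals plus noisy copies (simple additive Gaussian noise)."""
    rng = random.Random(seed)
    out = list(samples)
    for signal, label in samples:
        for _ in range(copies):
            noisy = [v + rng.gauss(0.0, noise_scale) for v in signal]
            out.append((noisy, label))
    return out


def add_negatives(samples, negative_signals):
    """Tag background/negative recordings with the catch-all label."""
    return samples + [(sig, NEGATIVE_LABEL) for sig in negative_signals]
```

A real pipeline would use domain-specific augmentations (time shifts, background mixing, etc.), but even this shape makes the open question visible: which negative cases count, and how many augmented copies stop adding information.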
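The easy/medium/hard split mentioned above could be driven by a difficulty proxy such as a reference model's per-sample confidence. A minimal sketch (the tercile cut points and the confidence proxy are illustrative assumptions):

```python
def difficulty_subsets(sample_ids, confidences):
    """Partition samples into easy/medium/hard terciles.

    Samples the reference model is most confident on are treated as
    "easy"; the least confident third becomes the "hard" subset.
    """
    ranked = sorted(zip(sample_ids, confidences), key=lambda p: -p[1])
    n = len(ranked)
    cut1, cut2 = n // 3, 2 * n // 3
    return {
        "easy": [sid for sid, _ in ranked[:cut1]],
        "medium": [sid for sid, _ in ranked[cut1:cut2]],
        "hard": [sid for sid, _ in ranked[cut2:]],
    }
```

The same idea applies to subsets of output classes: rank classes by per-class accuracy instead of samples by confidence.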
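The plug-and-play standardization idea above can be sketched as an adapter that fixes the input resolution and the output-class order for any model. The 96x96 resolution, the two-class list, and all function names here are hypothetical examples of what such a standard might pin down:

```python
STD_RESOLUTION = (96, 96)               # example agreed input size
STD_CLASSES = ["person", "no_person"]   # example agreed output classes


def nearest_resize(image, size):
    """Minimal nearest-neighbour resize for a 2-D list-of-lists image."""
    h, w = len(image), len(image[0])
    th, tw = size
    return [[image[r * h // th][c * w // tw] for c in range(tw)]
            for r in range(th)]


def make_standard_model(raw_predict, model_classes):
    """Wrap a model so it takes STD_RESOLUTION input and emits STD_CLASSES order."""
    index = {name: i for i, name in enumerate(model_classes)}

    def predict(image):
        x = nearest_resize(image, STD_RESOLUTION)
        scores = raw_predict(x)                         # model-native class order
        return [scores[index[c]] for c in STD_CLASSES]  # shared class order
    return predict
```

With such a wrapper, any two models targeting the standard can be swapped on the same dataset and scored identically, which is the prerequisite for fair cross-vendor comparison.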
A recording of the tinyML EMEA Benchmarking Panel (June 2023) is available here.
If you have more ideas or would like to join our working group, please reach out to Rosina at email@example.com.