From 797f623180787429e7e774e0c1442ece18b49ba3 Mon Sep 17 00:00:00 2001 From: Jason Liu Date: Sat, 25 Nov 2023 20:02:15 -0500 Subject: [PATCH] add evals --- docs/contributing.md | 4 ++++ tests/openai/evals/readme.md | 9 +++++++++ 2 files changed, 13 insertions(+) create mode 100644 tests/openai/evals/readme.md diff --git a/docs/contributing.md b/docs/contributing.md index 66ca952..51c0201 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -1,5 +1,9 @@ We would love for you to contribute to `Instructor`. +## [Evals](https://github.com/jxnl/instructor/tree/main/tests/openai/evals) + +We invite you to contribute evals in pytest as a way to monitor the quality of the openai models and the instructor library. To get started check out the [jxnl/instructor/tests/evals](https://github.com/jxnl/instructor/tree/main/tests/openai/evals) and contribute your own evals in the form of pytest tests. These evals will be run once a week and the results will be posted. + ## Issues If you find a bug, please file an issue on [our issue tracker on GitHub](https://github.com/jxnl/instructor/issues). diff --git a/tests/openai/evals/readme.md b/tests/openai/evals/readme.md new file mode 100644 index 0000000..7efaf19 --- /dev/null +++ b/tests/openai/evals/readme.md @@ -0,0 +1,9 @@ +# How to Contribute: Writing and Running Evaluation Tests + +We welcome contributors to expand our suite of evaluation tests for data extraction. This guide provides instructions on creating tests with `pytest`, `pydantic`, and other tools, focusing on broad coverage and failure modalities understanding. + +## Define Test Scenarios + +Identify data extraction scenarios relevant to you. Create test cases with inputs and expected outputs. + +Reference the `test_extract_users.py` which contains a test case for extracting users, using all models and all modes. The test case is parameterized with the model and mode, and the test function is parameterized with the input and expected output.