sqlexercise¶
Generate SQL assignments based on specified SQL errors and difficulty levels.
Submodules¶
Attributes¶
Mapping of SQL errors to their requirements. |
Exceptions¶
Custom exception for errors during exercise generation. |
Classes¶
Difficulty levels for SQL assignments. |
|
A full SQL assignment consisting of a dataset and exercises. |
|
A SQL dataset related to a specific domain, including schema creation and data insertion commands. |
|
A SQL exercise consisting of a title, request, and solutions. |
|
Base class for schema-related constraints. |
|
Base class for query-related constraints. |
|
Requirements for generating an assignment likely to trigger a specific error |
Functions¶
|
Select and return a random domain from predefined list. |
|
Generate SQL assignments based on the given SQL errors and their corresponding difficulty levels. |
Package Contents¶
- class sqlexercise.DifficultyLevel¶
Bases:
enum.IntEnumDifficulty levels for SQL assignments.
- EASY = 1¶
Minimal cognitive load, the assignments contains only elements related to triggering the error
- MEDIUM = 2¶
Moderate cognitive load, the assignments contains some elements not related to triggering the error
- HARD = 3¶
High cognitive load, the assignments contains elements not related to triggering the error and may require complex reasoning
- sqlexercise.random_domain(language)¶
Select and return a random domain from predefined list.
- Parameters:
language (str)
- Return type:
str
- class sqlexercise.Assignment¶
A full SQL assignment consisting of a dataset and exercises.
- dataset: sqlexercise.assignments.dataset.Dataset¶
The dataset associated with the assignment.
- exercises: list[sqlexercise.assignments.exercise.Exercise]¶
The exercises included in the assignment.
- class sqlexercise.Dataset¶
A SQL dataset related to a specific domain, including schema creation and data insertion commands.
- create_commands: list[str]¶
SQL commands to create the database schema.
- insert_commands: list[str]¶
SQL commands to insert data into the database.
- domain: str¶
The domain associated with the dataset.
- _catalog_cache: sqlscope.Catalog | None = None¶
Cached SQLScope Catalog for the dataset.
- _catalog_cache_commands_hash: int | None = None¶
Hash of the CREATE TABLE commands used to build the cached Catalog.
- property catalog: sqlscope.Catalog¶
Build and return a SQLScope Catalog from the dataset’s SQL commands. The result is cached for handling multiple accesses efficiently. Cache is properly invalidated if the CREATE TABLE commands change.
- Return type:
sqlscope.Catalog
- to_sql_no_context()¶
Generate the SQL commands to create and populate the dataset without schema context.
- Return type:
str
- to_sql(schema)¶
Generate the SQL commands to create and populate the dataset within the specified schema.
- Parameters:
schema (str)
- Return type:
str
- static from_sql(sql_str, sql_dialect)¶
Create a Dataset instance from a raw SQL string containing CREATE TABLE and INSERT INTO commands.
- Parameters:
sql_str (str)
sql_dialect (str)
- Return type:
- static generate(domain, sql_dialect, constraints, extra_details=[], *, db_host, db_port, db_user, db_password, language, max_attempts=5, on_attempt_start=lambda : ...)¶
Generate a SQL dataset based on the specified parameters.
- Parameters:
domain (str)
sql_dialect (str)
constraints (collections.abc.Sequence[sqlexercise.constraints.SchemaConstraint])
extra_details (list[str])
db_host (str)
db_port (int)
db_user (str)
db_password (str)
language (str)
max_attempts (int)
on_attempt_start (Callable[[], None])
- Return type:
- class sqlexercise.Exercise¶
A SQL exercise consisting of a title, request, and solutions.
- title: str¶
The title of the exercise.
- request: str¶
The natural language request or question for the exercise.
- solutions: list[sqlscope.Query]¶
The list of SQL query solutions for the exercise.
- difficulty: sqlexercise.difficulty_level.DifficultyLevel¶
The difficulty level of the exercise.
- error: sqlerrors.SqlErrors¶
The SQL error type associated with the exercise.
- static generate(error, difficulty, constraints, *, db_host, db_port, db_user, db_password, extra_details, dataset, title, sql_dialect, language, max_attempts=3, on_attempt_start=lambda : ...)¶
Generate a SQL exercise based on the specified parameters.
- Parameters:
error (sqlerrors.SqlErrors)
difficulty (sqlexercise.difficulty_level.DifficultyLevel)
constraints (list[sqlexercise.constraints.QueryConstraint])
db_host (str)
db_port (int)
db_user (str)
db_password (str)
extra_details (str)
dataset (sqlexercise.assignments.dataset.Dataset)
title (str)
sql_dialect (str)
language (str)
max_attempts (int)
on_attempt_start (Callable[[], None])
- Return type:
- class sqlexercise.SchemaConstraint¶
Bases:
sqlexercise.constraints.base.BaseConstraintBase class for schema-related constraints.
- abstractmethod validate(catalog, tables_sql, values_sql)¶
Validate if the given table creation and insertion statements satisfy the constraint.
- Args:
catalog (Catalog): The catalog representing the database schema. tables_sql (list[exp.Create]): List of CREATE TABLE expressions. values_sql (list[exp.Insert]): List of INSERT INTO expressions.
- Raises:
ConstraintValidationError: If the schema does not satisfy the constraint.
- Parameters:
catalog (sqlscope.Catalog)
tables_sql (list[sqlglot.exp.Create])
values_sql (list[sqlglot.exp.Insert])
- Return type:
None
- abstractmethod merge(other)¶
Merges this constraint with another constraint of the same type.
- Parameters:
other (SchemaConstraint)
- Return type:
- class sqlexercise.QueryConstraint¶
Bases:
sqlexercise.constraints.base.BaseConstraintBase class for query-related constraints.
- abstractmethod validate(query)¶
Validate if the given SQL query satisfies the constraint.
- Args:
query (Query): The SQL query to validate.
- Raises:
ConstraintValidationError: If the query does not satisfy the constraint.
- Parameters:
query (sqlscope.Query)
- Return type:
None
- class sqlexercise.SqlErrorRequirements(language)¶
Bases:
abc.ABCRequirements for generating an assignment likely to trigger a specific error
- Parameters:
language (str)
- dataset_constraints(difficulty)¶
Constraints the dataset must satisfy to likely trigger the error.
- Parameters:
difficulty (sqlexercise.difficulty_level.DifficultyLevel)
- Return type:
- exercise_constraints(difficulty)¶
Constraints the exercise must satisfy to likely trigger the error.
- Parameters:
difficulty (sqlexercise.difficulty_level.DifficultyLevel)
- Return type:
- exercise_extra_details()¶
Additional details or instructions for the exercise.
- Return type:
sqlexercise.translatable_text.TranslatableText
- dataset_extra_details()¶
Additional details or instructions for the dataset.
- Return type:
sqlexercise.translatable_text.TranslatableText
- sqlexercise.ERROR_REQUIREMENTS_MAP: dict[sqlerrors.SqlErrors, type[base.SqlErrorRequirements]]¶
Mapping of SQL errors to their requirements.
- exception sqlexercise.ExerciseGenerationError¶
Bases:
ExceptionCustom exception for errors during exercise generation.
- sqlexercise.generate_assignment(errors, db_host, db_port, db_user, db_password, sql_dialect='postgres', *, language='en', domain=None, dataset_str=None, shuffle_exercises=False, naming_func=lambda error, difficulty: ..., max_dataset_attempts=3, max_exercise_attempts=3, max_unique_attempts=3, max_workers=None, on_domain_selection=lambda domain: ..., on_dataset_generation_progress=lambda n, m: ..., on_exercise_generation_progress=lambda n, m: ..., on_dataset_generation_success=lambda : ..., on_exercise_generation_success=lambda e, d: ..., on_exercise_generation_failure=lambda e, d: ...)¶
Generate SQL assignments based on the given SQL errors and their corresponding difficulty levels.
Exercises are returned in the same order as the input errors.
Logging happens as soon as possible (during generation), and each message uses the exercise title as its id.
Deduplication is global across all generated exercises (thread-safe).
- Args:
errors (list[tuple[SqlErrors, DifficultyLevel]]): A list of (error, difficulty) pairs. sql_dialect (str): The SQL dialect to use for generating the dataset and exercises (e.g., ‘postgres’, ‘mysql’). domain (str | None): The domain for the assignments. If None, a random domain will be selected. language (str): The language for the assignment generation (e.g., ‘en’ for English). dataset_str (str | None): Optional SQL string to use as the dataset. If provided, it will be used instead of generating a new dataset. shuffle_exercises (bool): Whether to shuffle exercises to prevent ordering bias (shuffles input order). naming_func (Callable[[SqlErrors, DifficultyLevel], str]): Generates exercise titles. max_dataset_attempts (int): Maximum retries for generating a valid dataset before skipping. max_exercise_attempts (int): Maximum retries for generating a valid exercise before skipping. max_unique_attempts (int): Maximum retries to avoid duplicate solutions per (error, difficulty). max_workers (int | None): Thread pool size. If None, uses ThreadPoolExecutor default. on_domain_selection (Callable[[str], None]): Callback for when a domain is selected. on_dataset_generation_progress (Callable[[int, int], None]): Callback for dataset generation progress (current attempt, max attempts). on_exercise_generation_progress (Callable[[int, int], None]): Callback for exercise generation progress (current attempt, max attempts). on_dataset_generation_success (Callable[[], None]): Callback for successful dataset generation. on_exercise_generation_success (Callable[[SqlErrors, DifficultyLevel], None]): Callback for successful exercise generation. on_exercise_generation_failure (Callable[[SqlErrors, DifficultyLevel], None]): Callback for failed exercise generation.
- Returns:
Assignment: The generated assignment (stable order).
- Parameters:
errors (list[tuple[sqlerrors.SqlErrors, difficulty_level.DifficultyLevel]])
db_host (str)
db_port (int)
db_user (str)
db_password (str)
sql_dialect (str)
language (str)
domain (str | None)
dataset_str (str | None)
shuffle_exercises (bool)
naming_func (Callable[[sqlerrors.SqlErrors, difficulty_level.DifficultyLevel], str])
max_dataset_attempts (int)
max_exercise_attempts (int)
max_unique_attempts (int)
max_workers (int | None)
on_domain_selection (Callable[[str], None])
on_dataset_generation_progress (Callable[[int, int], None])
on_exercise_generation_progress (Callable[[int, int], None])
on_dataset_generation_success (Callable[[], None])
on_exercise_generation_success (Callable[[sqlerrors.SqlErrors, difficulty_level.DifficultyLevel], None])
on_exercise_generation_failure (Callable[[sqlerrors.SqlErrors, difficulty_level.DifficultyLevel], None])
- Return type: