[ENH] Simplified Publish API with Automatic Type Recognition#1554
Omswastik-11 wants to merge 26 commits into openml:main
Conversation
fkiraly
left a comment
I get this is a draft still; some early comments:
- it works for flows only. I would recommend trying at least two different object types to see the dispatching challenge there.
- do the extension checking inside `publish`, not in the usage example.
Thanks @fkiraly !!
The PR description is not entirely correct. This is how the interface looks currently:

```python
from openml_sklearn.extension import SklearnExtension
from sklearn.neighbors import KNeighborsClassifier

clf = KNeighborsClassifier(n_neighbors=3)
extension = SklearnExtension()  # user instantiates the extension object
knn_flow = extension.model_to_flow(clf)  # user manually converts the model (estimator instance) to an OpenMLFlow object
knn_flow.publish()
```

But I like the idea of a unified `publish`.
Thanks for the correction! I used the syntax from the example script. The unified publish was Franz's idea: https://github.com/gc-os-ai/openml-project-dev/issues/8
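To make the discussion concrete, here is a hedged sketch of what such a unified publish could look like. None of these names (`_converters`, `register_converter`, the predicate mechanism) are the real openml API; they are assumptions made purely for illustration of the dispatching idea:

```python
# Sketch: a single publish() entry point that either passes OpenML objects
# through or converts raw models via converters registered by extensions.
_converters = []  # list of (predicate, to_openml_object) pairs

def register_converter(predicate, to_openml):
    """Register a converter that turns a raw model into a publishable object."""
    _converters.append((predicate, to_openml))

def publish(obj, **metadata):
    """Publish an OpenML object directly, or convert a raw model first."""
    if hasattr(obj, "publish"):
        # Already an OpenML entity (flow, dataset, run, ...): publish as-is.
        return obj.publish()
    for predicate, to_openml in _converters:
        if predicate(obj):
            # A registered extension claims this object type; convert, then publish.
            return to_openml(obj, **metadata).publish()
    raise ValueError(f"No registered extension can publish {type(obj).__name__}")
```

An extension such as openml-sklearn would call `register_converter` at import time, which is why the examples below depend on the extension package being imported.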
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #1554      +/-   ##
==========================================
- Coverage   52.70%   52.63%   -0.07%
==========================================
  Files          37       38       +1
  Lines        4385     4404      +19
==========================================
+ Hits         2311     2318       +7
- Misses       2074     2086      +12
```

View full report in Codecov by Sentry.
I have added some comments. I also feel we should not populate
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
for more information, see https://pre-commit.ci
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
`openml/__init__.py:33`

`openml.publish(...)` is used by the new tests/examples, but the package top level does not expose a `publish` attribute (only the `publishing` submodule is imported). This will raise `AttributeError` for `openml.publish(...)`. Re-export the function from `openml.publishing` (and add it to `__all__`), or update the call sites to use `openml.publishing.publish(...)` consistently.
```python
from . import (
    _api_calls,
    config,
    datasets,
    evaluations,
    exceptions,
    extensions,
    flows,
    publishing,
    runs,
    setups,
    study,
    tasks,
```
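One way to implement the re-export Copilot suggests, sketched under the assumption that the `publish` function lives in `openml/publishing.py` (exact names and module layout unverified):

```python
# openml/__init__.py (sketch, not the actual file)
from . import publishing
from .publishing import publish  # re-export so `openml.publish(...)` resolves

__all__ = ["publishing", "publish"]
```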
Co-authored-by: Armaghan Shakir <raoarmaghanshakir040@gmail.com>
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
```python
# ## Upload the machine learning experiments to OpenML
# First, create a flow and fill it with metadata about the machine learning model.
#
# ### Option A: Automatic publishing (simplified)
# The publish function automatically detects the model type and creates the flow:

# %%
knn_flow = openml.flows.OpenMLFlow(
    # Metadata
    model=clf,  # or None, if you do not want to upload the model object.
    name="CustomKNeighborsClassifier",
    description="A custom KNeighborsClassifier flow for OpenML.",
    external_version=f"{sklearn.__version__}",
    language="English",
    tags=["openml_tutorial_knn"],
    dependencies=f"{sklearn.__version__}",
    # Hyperparameters
    parameters={k: str(v) for k, v in knn_parameters.items()},
    parameters_meta_info={
        "n_neighbors": {"description": "number of neighbors to use", "data_type": "int"}
    },
    # If you have a pipeline with subcomponents, such as preprocessing, add them here.
    components={},
)
knn_flow.publish()
print(f"knn_flow was published with the ID {knn_flow.flow_id}")
knn_flow = openml.publish(clf, tags=["openml_tutorial_knn"])
print(f"Flow was auto-published with ID {knn_flow.flow_id}")
```
This tutorial now uses `openml.publish(clf, ...)`, which requires an installed/registered scikit-learn extension (typically `openml-sklearn`). Since the script doesn't import `openml_sklearn` or mention the dependency, users running the example without that extra will get a `ValueError`. Consider adding a short note (or an explicit `import openml_sklearn  # noqa: F401`) near the top so the example is self-contained and the requirement is clear.
openml/publishing.py
```python
if tags and hasattr(obj, "tags"):
    existing = obj.tags or []
    if all(isinstance(tag, str) for tag in existing):
        obj.tags = list(dict.fromkeys([*existing, *tags]))
if name is not None and hasattr(obj, "name"):
    obj.name = name
return obj.publish()
```
`tags` is typed as `Sequence[str]`, but at runtime passing a single string (e.g., `tags="foo"`) will be treated as an iterable of characters and will silently add `"f"`, `"o"`, `"o"`. It would be safer to validate that `tags` is not a `str` (and that all provided tags are strings) and raise a clear `TypeError`/`ValueError` when the input is invalid.
PGijsbers
left a comment
I have to give this some more consideration. Here are my thoughts so far.
Currently, all OpenML entities that are publishable have a `publish()` method. In that sense, the API is already unified. In my view, the benefit of this is therefore mainly that it can extend to new object types, such as estimators for which an extension is registered that can serialize them to a Flow. So it seems there are now two alternatives:
```python
dataset = OpenMLDataset(name="foo", tags=["bar"], ...)
dataset.publish()
```

or

```python
dataset = OpenMLDataset(...)
openml.publish(dataset, name="foo", tags=["bar"])
# of course, name and tags could also have been provided during initialisation of the OpenMLDataset
```

So for anything that is already an OpenML object, I do not really see the benefit. It just introduces two different ways to do things, which I would generally be against. (I assume here the intention is for the publish method to remain on the OpenML objects as well.)
For estimators that are to be converted to flows, this is of course significantly shorter, as shown in the original proposal. However, I think it would also be worth considering an alternative. Consider that perhaps instead an OpenMLFlow could be initialised with an arbitrary object, which would be attempted to be resolved with extensions. Then it could also provide a similarly smooth experience:

```python
estimator = sklearn.tree.DecisionTreeClassifier()
flow = OpenMLFlow(estimator, name="foo", tags=["bar"])
flow.publish()
```

Sure, it introduces an extra line of code, but it does make it explicit to the user what kind of OpenML object they are publishing.
I am trying to think of other categories of objects that would be parsed into OpenML objects and would also benefit from this general publish function, but I can't really think of any. In most cases, I think it would be far more useful to have one dedicated function to create the object and thus communicate the metadata schema. E.g., a dataset can have a name, description, author, and so on, and I do not think this is something that would translate well into a general publish call (thinking of e.g., `openml.publish(dataframe, name="...", description="...", ...)`, where the type hints cannot provide information as to what metadata is valid).
In any case, the coupling of object creation with publication to the platform is problematic in the case where users do not have an internet connection. This can be the case where e.g., a user prepopulates their cache when they have a connection and then executes experiments in an offline setting (most commonly some compute server setups, but potentially also something like a regular outage). While we do provide some utility functions that do this (like run_model_on_task, though there too we made sure it could run offline with the right arguments), I am hesitant to introduce that as the main way to create/share objects.