Builders for engines and readers #429

@reckart

Description

Is your feature request related to a problem? Please describe.
We have the createEngineDescription methods and friends in uimaFIT. However, their parameters can be a bit confusing. For simple cases, we pass a class followed by the parameter/value combinations as pairs. However, if we want to add a type system, type priorities, or other items, it either becomes fragile, because it is easy to accidentally intermix those with the parameters, or it is plainly impossible because no createEngineDescription signature accepting the respective item exists.

Describe the solution you'd like
It would be nice to have a builder which would allow stuff like this:

var engineDescription = AnalysisEngineDescription.builder(MyAnalysisEngine.class) //
    .withTypeSystem(TypeSystemDescriptionFactory.createTypeSystemDescription()) // can probably be omitted in most cases
    .withParameter(MyAnalysisEngine.PARAM_BLAH, "blub") // single parameter
    .withParameters( // multiples as pairs because it is convenient to not have to repeat "withParameter" all the time
         MyAnalysisEngine.PARAM_FOO, "foo", //
         MyAnalysisEngine.PARAM_BAR, "bar")
    .withTypePriorities(...) //
    .build();
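
To make the idea a bit more concrete, here is a minimal, self-contained sketch of how such a builder could keep parameters apart from the other description items. Everything in it is hypothetical: `EngineDescriptionSketch` and its `Builder` are stand-ins that do not exist in uimaFIT, and the type system is held as a plain `Object` placeholder instead of a real `TypeSystemDescription`. It only illustrates the shape of the API, not a proposed implementation.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the proposed builder; not part of uimaFIT.
final class EngineDescriptionSketch {
    final Class<?> componentClass;
    final Object typeSystem; // placeholder for a TypeSystemDescription (assumption)
    final Map<String, Object> parameters;

    private EngineDescriptionSketch(Builder b) {
        this.componentClass = b.componentClass;
        this.typeSystem = b.typeSystem;
        // Copy so later changes to the builder do not leak into the built object.
        this.parameters = Collections.unmodifiableMap(new LinkedHashMap<>(b.parameters));
    }

    static Builder builder(Class<?> componentClass) {
        return new Builder(componentClass);
    }

    static final class Builder {
        private final Class<?> componentClass;
        private Object typeSystem; // stays null unless set explicitly, nothing is scanned
        private final Map<String, Object> parameters = new LinkedHashMap<>();

        private Builder(Class<?> componentClass) {
            this.componentClass = componentClass;
        }

        // The type system has its own method, so it cannot be mixed up with parameters.
        Builder withTypeSystem(Object typeSystem) {
            this.typeSystem = typeSystem;
            return this;
        }

        Builder withParameter(String name, Object value) {
            parameters.put(name, value);
            return this;
        }

        // Accepts name/value pairs and rejects malformed input early.
        Builder withParameters(Object... pairs) {
            if (pairs.length % 2 != 0) {
                throw new IllegalArgumentException("Parameters must be name/value pairs");
            }
            for (int i = 0; i < pairs.length; i += 2) {
                if (!(pairs[i] instanceof String)) {
                    throw new IllegalArgumentException(
                            "Parameter name at index " + i + " is not a String");
                }
                parameters.put((String) pairs[i], pairs[i + 1]);
            }
            return this;
        }

        EngineDescriptionSketch build() {
            return new EngineDescriptionSketch(this);
        }
    }
}
```

Because the varargs method only ever receives parameters, an odd number of arguments or a non-String name can be reported as an error right away instead of being silently misinterpreted.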

Describe alternatives you've considered
Instead of a normal builder pattern, a customizer pattern could also be used. That might make working with nested elements in the description more convenient. E.g.

var engineDescription = AnalysisEngineDescription.builder(MyAnalysisEngine.class)
    .metadata(md -> md
        .name("My Analysis Engine")
        .vendor("ACME")
        .typeSystem(TypeSystemDescriptionFactory.createTypeSystemDescription()))
    .parameters(params -> params
        .set(MyAnalysisEngine.PARAM_FOO, "foo")
        .set(MyAnalysisEngine.PARAM_BAR, "bar"))
    .build();
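
The customizer style could be sketched roughly as follows, using `java.util.function.Consumer` callbacks for the nested parts. Again, all types here (`CustomizerSketch`, `Metadata`, `Parameters`, `Builder`) are hypothetical names made up for illustration and are not uimaFIT API. The type system is left out to keep the sketch free of dependencies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of the customizer style; none of these types exist in uimaFIT.
final class CustomizerSketch {
    final Class<?> componentClass;
    final Metadata metadata;
    final Parameters parameters;

    private CustomizerSketch(Builder b) {
        this.componentClass = b.componentClass;
        this.metadata = b.metadata;
        this.parameters = b.parameters;
    }

    static Builder builder(Class<?> componentClass) {
        return new Builder(componentClass);
    }

    // Nested element holding descriptive metadata.
    static final class Metadata {
        String name;
        String vendor;

        Metadata name(String name) { this.name = name; return this; }
        Metadata vendor(String vendor) { this.vendor = vendor; return this; }
    }

    // Nested element holding the configuration parameters.
    static final class Parameters {
        final Map<String, Object> values = new LinkedHashMap<>();

        Parameters set(String name, Object value) { values.put(name, value); return this; }
    }

    static final class Builder {
        private final Class<?> componentClass;
        private final Metadata metadata = new Metadata();
        private final Parameters parameters = new Parameters();

        private Builder(Class<?> componentClass) {
            this.componentClass = componentClass;
        }

        // Each customizer receives the nested element and configures it in place.
        Builder metadata(Consumer<Metadata> customizer) {
            customizer.accept(metadata);
            return this;
        }

        Builder parameters(Consumer<Parameters> customizer) {
            customizer.accept(parameters);
            return this;
        }

        CustomizerSketch build() {
            return new CustomizerSketch(this);
        }
    }
}
```

One trade-off of this style is that each nested element needs its own small configuration type, but in exchange the top-level builder stays short and the nesting in the calling code mirrors the nesting in the description.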

Additional context
Important: the new approach should not auto-scan for type system descriptions or similar metadata. Scanning can be slow in certain environments, and doing it for every analysis engine etc. is not necessary. If a CAS needs to be created with a scanned type system, CasFactory.createCas() should be used. It is sufficient if the CAS knows the type system. It should not be necessary for each and every component to know it (unless you build a pipeline from a bunch of components that each come with their own local partial type system, which then needs to be merged into the pipeline type system).
