feat: Enhance architecture and roadmap documentation with material extensibility and persistence layer details
@@ -37,6 +37,8 @@ Current product definition:
|
|||||||
- The project model is composable: a minimal core plus optional domain-specific extensions.
|
- The project model is composable: a minimal core plus optional domain-specific extensions.
|
||||||
- Domain-specific detail (for example knitting patterns/yarns, 3D print profiles/STLs, electronics BoM data) should be representable without being mandatory for all instances.
|
- Domain-specific detail (for example knitting patterns/yarns, 3D print profiles/STLs, electronics BoM data) should be representable without being mandatory for all instances.
|
||||||
- First-party FeDIY focuses on a stable extension mechanism rather than implementing every niche schema directly.
|
- First-party FeDIY focuses on a stable extension mechanism rather than implementing every niche schema directly.
|
||||||
|
- Materials are also an extensible entity: the core material record captures display name, quantity, and unit; domain-specific attributes (yarn weight, fibre content, filament diameter/material, wood species/grade, electronics component value/package) are carried in extension payloads on the material entry, using the same extension mechanism as project-level extensions.
|
||||||
|
- A federated material catalog is a long-term goal: community-defined material types and shared taxonomy entries could be federated as ActivityPub objects, allowing instances to reference a common vocabulary without requiring central authority.
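The core-plus-extension shape of a material entry described above can be sketched in Rust. This is only an illustration under stated assumptions: the type names (`Material`, `ExtensionPayload`), the string-keyed attribute map, and the `example.yarn` namespace are hypothetical, and the real payload shape and namespacing are still to be decided in the extension ADR.

```rust
use std::collections::BTreeMap;

/// Hypothetical extension payload: opaque attributes owned by one namespace.
#[derive(Debug, Clone, PartialEq)]
pub struct ExtensionPayload {
    pub attributes: BTreeMap<String, String>,
}

/// Core material record: display name, quantity, and unit are always present;
/// everything domain-specific lives in `extensions`, keyed by namespace.
#[derive(Debug, Clone, PartialEq)]
pub struct Material {
    pub display_name: String,
    pub quantity: f64,
    pub unit: String,
    pub extensions: BTreeMap<String, ExtensionPayload>,
}

impl Material {
    pub fn new(display_name: &str, quantity: f64, unit: &str) -> Self {
        Material {
            display_name: display_name.to_string(),
            quantity,
            unit: unit.to_string(),
            extensions: BTreeMap::new(),
        }
    }

    /// Attach a domain-specific payload under an (assumed) namespace string.
    pub fn with_extension(mut self, namespace: &str, attrs: &[(&str, &str)]) -> Self {
        let attributes = attrs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        self.extensions
            .insert(namespace.to_string(), ExtensionPayload { attributes });
        self
    }

    /// Render the core fields, plus only those extensions the caller supports.
    /// Unknown extensions are skipped, so the core record always stays readable.
    pub fn describe(&self, supported: &[&str]) -> String {
        let mut out = format!("{} {} {}", self.quantity, self.unit, self.display_name);
        for (namespace, payload) in &self.extensions {
            if supported.contains(&namespace.as_str()) {
                for (key, value) in &payload.attributes {
                    out.push_str(&format!("; {}={}", key, value));
                }
            }
        }
        out
    }
}
```

A client that does not understand a given namespace still gets a usable core record from `describe`, which is the "representable without being mandatory" property the list above asks for.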
|
||||||
|
|
||||||
## Client and Front-End Strategy
|
## Client and Front-End Strategy
|
||||||
|
|
||||||
@@ -84,6 +86,45 @@ Bundled web UI:
|
|||||||
- Preserve source metadata for remote content and actor provenance.
|
- Preserve source metadata for remote content and actor provenance.
|
||||||
- Track object lifecycle states to support idempotent federation processing.
|
- Track object lifecycle states to support idempotent federation processing.
|
||||||
- Persist project data as core fields plus extension payloads so instances can tailor domain detail without fragmenting the base model.
|
- Persist project data as core fields plus extension payloads so instances can tailor domain detail without fragmenting the base model.
|
||||||
|
- Material entries within a project carry the same extension payload structure as the project itself; domain-specific material attributes are co-located with the material entry rather than encoded in project-level fields.
|
||||||
|
|
||||||
|
## Persistence Layer Architecture
|
||||||
|
|
||||||
|
### Database
|
||||||
|
|
||||||
|
PostgreSQL is the primary persistence target.
|
||||||
|
|
||||||
|
Reasons:
|
||||||
|
|
||||||
|
- **JSONB**: ActivityPub objects and extension payloads are stored as structured JSON with PostgreSQL's JSONB operators. Extension payloads can be queried, indexed, and validated without an external document store.
|
||||||
|
- **Native full-text search**: PostgreSQL's built-in FTS with `tsvector`/`tsquery` removes the need for an external search service in Phase 1. Language-specific text search configurations (stemming, stop words) can be chosen per column or per query.
|
||||||
|
- **Transactional consistency under federation load**: federation fan-in (many incoming AP activities from many peers) involves concurrent writes. PostgreSQL's MVCC concurrency model and row-level locking handle this safely. SQLite's single-writer model would be a bottleneck under the same load.
|
||||||
|
- **Broad managed hosting support**: PostgreSQL is offered as a managed service by every major cloud platform and most hosting providers, which keeps operational effort low and lowers the barrier for instance operators.
|
||||||
|
|
||||||
|
SQLite is **not in scope for Phase 1** but is explicitly not ruled out as a future lightweight self-hosting option (see Repository Abstraction below).
|
||||||
|
|
||||||
|
### Repository Abstraction
|
||||||
|
|
||||||
|
No business logic or domain service code queries the database directly. All persistence operations go through a repository interface layer:
|
||||||
|
|
||||||
|
- Each domain aggregate (project, account, actor, moderation record, etc.) has a corresponding repository interface defined as a trait.
|
||||||
|
- The domain service layer depends only on those traits, not on any database library types.
|
||||||
|
- The PostgreSQL implementation of each trait is the only first-party implementation.
|
||||||
|
- A SQLite implementation could be added in a future phase by implementing the same traits with SQLite-dialect queries — zero changes to domain logic or API handlers would be required.
|
||||||
|
- The repository layer is also the natural seam for test doubles: domain logic tests can use an in-memory implementation of the traits without a running database.
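The repository seam described in this list can be sketched as follows. This is a minimal illustration, not the project's actual interface: `Project`, `ProjectRepository`, `RepoError`, and `rename_project` are hypothetical names, and the trait is synchronous here to keep the sketch dependency-free, whereas the real traits are expected to be async.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily reduced domain aggregate.
#[derive(Debug, Clone, PartialEq)]
pub struct Project {
    pub id: u64,
    pub title: String,
}

#[derive(Debug, PartialEq)]
pub enum RepoError {
    NotFound,
}

/// Repository interface: the only thing domain code is allowed to depend on.
pub trait ProjectRepository {
    fn save(&mut self, project: Project) -> Result<(), RepoError>;
    fn find(&self, id: u64) -> Result<Project, RepoError>;
}

/// In-memory test double; a PostgreSQL-backed type would implement the same trait.
#[derive(Default)]
pub struct InMemoryProjectRepository {
    rows: HashMap<u64, Project>,
}

impl ProjectRepository for InMemoryProjectRepository {
    fn save(&mut self, project: Project) -> Result<(), RepoError> {
        self.rows.insert(project.id, project);
        Ok(())
    }

    fn find(&self, id: u64) -> Result<Project, RepoError> {
        self.rows.get(&id).cloned().ok_or(RepoError::NotFound)
    }
}

/// Domain service logic: generic over the trait, with no database types in sight.
pub fn rename_project<R: ProjectRepository>(
    repo: &mut R,
    id: u64,
    title: &str,
) -> Result<Project, RepoError> {
    let mut project = repo.find(id)?;
    project.title = title.to_string();
    repo.save(project.clone())?;
    Ok(project)
}
```

Because `rename_project` only names the trait, swapping the in-memory double for a PostgreSQL or SQLite implementation would not touch the function body.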
|
||||||
|
|
||||||
|
### Query Library
|
||||||
|
|
||||||
|
The query library choice is deferred to the implementation ADR, but the constraints are:
|
||||||
|
|
||||||
|
- Must support async execution.
|
||||||
|
- Must support both PostgreSQL and SQLite dialects (to keep the SQLite future option open).
|
||||||
|
- Compile-time query checking is strongly preferred to catch SQL errors before runtime.
|
||||||
|
- `sqlx` satisfies all three constraints and is the expected choice; the final decision will be recorded in the ADR.
|
||||||
|
|
||||||
|
### Migration Strategy
|
||||||
|
|
||||||
|
See Q28. Database migrations are run as part of application startup (or a separate migrate subcommand) and are versioned, idempotent, and checked into source control alongside the schema they produce.
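The "versioned and idempotent" property can be illustrated with a small runner. This is a sketch under assumptions: `Migration`, `MigrationLedger`, and `run_migrations` are hypothetical names, the ledger stands in for whatever bookkeeping table the chosen tool maintains, and no SQL is executed here.

```rust
use std::collections::BTreeSet;

/// Hypothetical migration descriptor; a real one would also carry the SQL to run.
pub struct Migration {
    pub version: u32,
    pub name: &'static str,
}

/// Stand-in for the database table that records which versions have been applied.
#[derive(Default)]
pub struct MigrationLedger {
    applied: BTreeSet<u32>,
}

/// Applies every not-yet-applied migration in ascending version order and
/// returns the versions applied during this call. Calling it again with the
/// same inputs applies nothing, which is what makes startup runs safe to repeat.
pub fn run_migrations(ledger: &mut MigrationLedger, migrations: &[Migration]) -> Vec<u32> {
    let mut pending: Vec<&Migration> = migrations
        .iter()
        .filter(|m| !ledger.applied.contains(&m.version))
        .collect();
    pending.sort_by_key(|m| m.version);

    let mut ran = Vec::new();
    for migration in pending {
        // A real runner would execute the migration's SQL in a transaction here
        // and record the version only if it succeeds.
        ledger.applied.insert(migration.version);
        ran.push(migration.version);
    }
    ran
}
```

The same function can back either entry point mentioned above: called during application startup, or from a separate migrate subcommand.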
|
||||||
|
|
||||||
## Federation Strategy
|
## Federation Strategy
|
||||||
|
|
||||||
@@ -92,6 +133,41 @@ Bundled web UI:
|
|||||||
- Use replay-safe request validation and deterministic retry behavior.
|
- Use replay-safe request validation and deterministic retry behavior.
|
||||||
- Maintain an interop test matrix for protocol behaviors.
|
- Maintain an interop test matrix for protocol behaviors.
|
||||||
|
|
||||||
|
## Content Integrity
|
||||||
|
|
||||||
|
Certain categories of content are prohibited on any FeDIY instance regardless of operator configuration. These are not moderation policy — they are non-negotiable constraints enforced at the software level. The guiding principle is **consent**: the prohibited categories share the property that no legitimate consent to the content's creation or publication can exist.
|
||||||
|
|
||||||
|
### Hardcoded Prohibitions
|
||||||
|
|
||||||
|
| Category | Rationale | Enforcement approach |
|
||||||
|
|---|---|---|
|
||||||
|
| Child Sexual Abuse Material (CSAM) | Minors cannot consent to sexual content; production is abuse | Perceptual hash-matching against NCMEC hash database on every upload |
|
||||||
|
| Non-Consensual Intimate Imagery (NCII) | Subject has not consented to distribution | Hash-matching against StopNCII or equivalent database on every upload |
|
||||||
|
| Doxxing | Individual has not consented to publication of their private identifying information | Upload-time pattern detection (phone, address, government ID formats) as a signal; mandatory human-review flagging pipeline; rapid takedown tooling |
|
||||||
|
|
||||||
|
### What Is Not Hardcoded
|
||||||
|
|
||||||
|
Weapons, violence, and dual-use content are **not** hardcoded prohibitions. Legitimate DIY projects (fireworks, blacksmithing, blade-smithing, pyrotechnics, casting) cannot be reliably distinguished from genuinely harmful content at the software level. This category is handled by operator content policy and community moderation tools, not by the platform software.
|
||||||
|
|
||||||
|
### Hash-Matching Infrastructure
|
||||||
|
|
||||||
|
- All media uploaded to a FeDIY instance is processed through the hash-matching pipeline **before** storage is confirmed. A match results in rejection of the upload and triggers the reporting workflow.
|
||||||
|
- Hash databases are not bundled with the software. Operators must configure the integration (for example an NCMEC-provided hash list, Microsoft PhotoDNA, or an equivalent) before media upload is enabled. The application refuses to enable media upload without a configured hash-matching integration.
|
||||||
|
- Hash-matching is performed locally on the server; media content is never transmitted to a third-party hash service. Only the computed hash is compared.
|
||||||
|
- FeDIY provides clear integration documentation and a test mode for operators to verify their configuration before going live.
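The gating rules in this list (no uploads without a configured matcher, and a match check before storage is confirmed) can be sketched as below. Everything here is hypothetical: `HashMatcher`, `LocalHashList`, `MediaService`, and `UploadOutcome` are illustrative names, and `placeholder_hash` is an ordinary non-perceptual hash used only so the sketch runs. A real deployment would use the perceptual hashing scheme required by the configured hash database.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical interface to a locally held hash list.
pub trait HashMatcher {
    fn is_known_match(&self, media_hash: u64) -> bool;
}

pub struct LocalHashList {
    known: HashSet<u64>,
}

impl LocalHashList {
    pub fn new(hashes: impl IntoIterator<Item = u64>) -> Self {
        LocalHashList { known: hashes.into_iter().collect() }
    }
}

impl HashMatcher for LocalHashList {
    fn is_known_match(&self, media_hash: u64) -> bool {
        self.known.contains(&media_hash)
    }
}

/// Placeholder only: this is NOT a perceptual hash and must not be used for real matching.
pub fn placeholder_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

#[derive(Debug, PartialEq)]
pub enum UploadOutcome {
    Stored,
    /// The upload matched the hash list; the reporting workflow would start here.
    RejectedMatch,
    /// No matcher is configured, so media upload stays disabled.
    UploadsDisabled,
}

pub struct MediaService<M: HashMatcher> {
    matcher: Option<M>,
    pub stored: Vec<Vec<u8>>,
}

impl<M: HashMatcher> MediaService<M> {
    pub fn new(matcher: Option<M>) -> Self {
        MediaService { matcher, stored: Vec::new() }
    }

    /// Hashes locally, checks the list, and only then confirms storage.
    pub fn upload(&mut self, bytes: &[u8]) -> UploadOutcome {
        let matcher = match &self.matcher {
            Some(m) => m,
            None => return UploadOutcome::UploadsDisabled,
        };
        if matcher.is_known_match(placeholder_hash(bytes)) {
            return UploadOutcome::RejectedMatch;
        }
        self.stored.push(bytes.to_vec());
        UploadOutcome::Stored
    }
}
```

Modelling the matcher as `Option<M>` makes the "refuse to enable upload" rule a property of the type: there is no code path that stores media without consulting a matcher.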
|
||||||
|
|
||||||
|
### Doxxing Detection
|
||||||
|
|
||||||
|
- Upload-time and submission-time scanning checks text content for patterns consistent with personal identifying information: phone number formats, postal address patterns, government ID number patterns (country-configurable), and combinations that together identify an individual.
|
||||||
|
- Pattern matching is a signal, not a block: false positives (a project step referencing a phone socket) must not prevent legitimate content. Matched content is flagged for moderator review, not auto-rejected.
|
||||||
|
- Before registrations are opened, every instance must have at least one active moderator account to receive flagged-content alerts.
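The "signal, not a block" rule can be made concrete with a small sketch. The heuristic below is a deliberately crude assumption (a run of ten or more digits, allowing common separators), and `ScanOutcome`, `looks_like_phone_number`, and `scan_submission` are hypothetical names. The point is the shape of the result type, which has no rejection variant at all.

```rust
/// Outcome of text scanning. There is intentionally no "reject" variant:
/// a pattern hit only routes the content to a moderator.
#[derive(Debug, PartialEq)]
pub enum ScanOutcome {
    Accept,
    FlagForReview { reason: &'static str },
}

/// Crude, assumed heuristic: true if the text contains at least `min_digits`
/// digits in a run where spaces, '-', '(', ')', '+' and '.' may separate them.
pub fn looks_like_phone_number(text: &str, min_digits: usize) -> bool {
    let mut digits = 0;
    for ch in text.chars() {
        if ch.is_ascii_digit() {
            digits += 1;
            if digits >= min_digits {
                return true;
            }
        } else if !matches!(ch, ' ' | '-' | '(' | ')' | '+' | '.') {
            // Any other character ends the current run.
            digits = 0;
        }
    }
    false
}

/// Scans submitted text and returns a moderation signal.
pub fn scan_submission(text: &str) -> ScanOutcome {
    if looks_like_phone_number(text, 10) {
        ScanOutcome::FlagForReview { reason: "possible phone number" }
    } else {
        ScanOutcome::Accept
    }
}
```

A heuristic this simple will also flag long runs of measurements, which is exactly why a hit goes to moderator review rather than blocking the submission.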
|
||||||
|
|
||||||
|
### Reporting and Legal Obligations
|
||||||
|
|
||||||
|
- When a CSAM match is confirmed by a moderator, the operator is required to report to the relevant national authority (NCMEC CyberTipline in the US, IWF in the UK, etc.). FeDIY provides a reporting workflow and documentation; the legal obligation rests with the instance operator as the data controller and platform host.
|
||||||
|
- NCII confirmed matches follow the StopNCII/similar removal workflow; the operator notifies the subject where possible.
|
||||||
|
- The platform stores a minimal, anonymised record of confirmed violations and reports for the operator's legal compliance purposes.
|
||||||
|
|
||||||
## Moderation and Safety Strategy
|
## Moderation and Safety Strategy
|
||||||
|
|
||||||
### Instance and Moderator-Level Controls
|
### Instance and Moderator-Level Controls
|
||||||
|
|||||||
+30
-9
@@ -13,6 +13,9 @@ Each question is tagged with the phase it blocks or most affects: [P0], [P1], [P
|
|||||||
- Explicit project versioning is preferred and should be supported.
|
- Explicit project versioning is preferred and should be supported.
|
||||||
- A project is composable: FeDIY provides a minimal core model, and instances can tailor project detail via optional domain-specific extensions.
|
- A project is composable: FeDIY provides a minimal core model, and instances can tailor project detail via optional domain-specific extensions.
|
||||||
- FeDIY should not require first-party implementation of every domain detail up front; expert communities can define richer schemas over time.
|
- FeDIY should not require first-party implementation of every domain detail up front; expert communities can define richer schemas over time.
|
||||||
|
- Materials are also extensible entities. The core material record (name, quantity, unit) can carry domain-specific extension payloads using the same mechanism as project extensions. Community-defined material type schemas (e.g. yarn, filament, PCB component) can be layered on without modifying the core model.
|
||||||
|
- **Database: PostgreSQL is the primary persistence target.** JSONB, native full-text search, transactional consistency under concurrent federation fan-in, and broad managed hosting support make it the clear long-term fit. The persistence layer is behind a repository abstraction (trait-based interfaces), which keeps business logic independent of the database driver and leaves SQLite viable as a future lightweight self-hosting option without requiring changes to domain logic. See [ADR TBD: Persistence Layer Architecture].
|
||||||
|
- **Baseline content prohibitions (hardcoded, not operator-configurable):** CSAM, doxxing, and non-consensual intimate imagery (NCII) are prohibited on any FeDIY instance regardless of operator policy. The guiding principle is **consent**: minors cannot consent to sexual content; individuals have not consented to having their private identifying information published; subjects of intimate imagery have not consented to its distribution. Enforcement is in-code as far as technically feasible (hash-matching for CSAM and NCII; upload-time pattern detection and mandatory human-review tooling for doxxing). Weapons, violence, and similar dual-use content are **not** hardcoded prohibitions: legitimate DIY projects (fireworks, blacksmithing, knife-making) cannot be reliably distinguished from genuinely harmful content at the software level, so they are handled by operator content policy and community moderation.
|
||||||
|
|
||||||
## Upfront Clarification Plan (P0 -> Early P1)
|
## Upfront Clarification Plan (P0 -> Early P1)
|
||||||
|
|
||||||
@@ -26,9 +29,9 @@ The goal is to remove ambiguity before implementation while keeping scope realis
|
|||||||
|
|
||||||
1. Decision track B: Core-plus-extension contract
|
1. Decision track B: Core-plus-extension contract
|
||||||
|
|
||||||
- Resolve Q2, Q2a, Q2b, and Q2c as one coherent contract.
|
- Resolve Q2, Q2a, Q2b, Q2c, and Q2d as one coherent contract.
|
||||||
- Output: one ADR for core project fields and one ADR for extension payload shape/namespacing/discovery.
|
- Output: one ADR for core project and material fields, and one ADR for extension payload shape/namespacing/discovery (shared by both project and material extensions).
|
||||||
- Success signal: instances can add domain-specific detail (knitting, 3D print, electronics, etc.) without changing core semantics.
|
- Success signal: instances can add domain-specific detail at both the project level and the material entry level (knitting yarn specs, 3D filament profiles, electronics BoM entries, etc.) without changing core semantics.
|
||||||
|
|
||||||
1. Decision track C: Search/index policy for extensions
|
1. Decision track C: Search/index policy for extensions
|
||||||
|
|
||||||
@@ -57,6 +60,26 @@ The goal is to remove ambiguity before implementation while keeping scope realis
|
|||||||
|
|
||||||
- Free-text only, or linked to a shared taxonomy/catalog?
|
- Free-text only, or linked to a shared taxonomy/catalog?
|
||||||
- Do quantities and units need to be structured (for search/filtering), or is prose sufficient for MVP?
|
- Do quantities and units need to be structured (for search/filtering), or is prose sufficient for MVP?
|
||||||
|
- Are materials solely inline entries within a project, or are they also independent top-level entities that can be referenced by multiple projects?
|
||||||
|
- Does the core material record have a fixed set of fields (name, quantity, unit, optional note) with extension payloads carrying domain-specific attributes? Or is the entire material record opaque beyond a display name?
|
||||||
|
- How are domain-specific material extensions (yarn weight/fibre/colourway, filament diameter/material/temperature profile, wood species/grade/finish, electronics component value/tolerance/package) represented? Same extension payload shape as project extensions, or a separate material-type schema?
|
||||||
|
- At what phase should structured quantity/unit data be required? A unit normalisation scheme (SI base + common craft units) would enable cross-instance filtering and quantity scaling; is that P1 or later?
|
||||||
|
|
||||||
|
**Q2d [P1/P2]** What is the extension model for domain-specific material data?
|
||||||
|
|
||||||
|
- Does a material entry carry the same extension payload mechanism as a project (namespaced typed block, discoverable schema)?
|
||||||
|
- Who can define a material type schema — instance operators only, or any community contributor who publishes a schema at a well-known URL?
|
||||||
|
- Should FeDIY ship a small set of reference material type schemas (yarn, 3D filament, electronic component) as non-mandatory examples to establish the pattern?
|
||||||
|
- How does a client know which material type extension(s) a given entry carries, and how does it fall back gracefully when it does not support the type?
|
||||||
|
- What is the relationship between project-level extensions and material-level extensions? Can a domain extension define both a project schema and a material schema that are versioned together?
|
||||||
|
|
||||||
|
**Q2e [P2/P3]** What would a federated material catalog look like?
|
||||||
|
|
||||||
|
- Could shared material types (e.g. a yarn colourway catalog, a filament brand/profile registry) be published as ActivityPub objects and federated between instances?
|
||||||
|
- What ActivityPub object type would a material catalog entry use? A custom `fediy:MaterialType` in a FeDIY JSON-LD context, or an existing vocabulary term?
|
||||||
|
- Who is authoritative for a shared catalog entry — the originating instance, a designated community instance, or a multi-instance governance process?
|
||||||
|
- How does a material catalog entry interact with the right to erasure / RTBF? If a community-curated entry (not personal data) is federated, different deletion semantics apply than for user personal data.
|
||||||
|
- Is a federated catalog in scope before federation is live (Phase 2), or does it only make sense alongside AP federation?
|
||||||
|
|
||||||
**Q2a [P1]** How are required tools modelled?
|
**Q2a [P1]** How are required tools modelled?
|
||||||
|
|
||||||
@@ -256,10 +279,9 @@ The goal is to remove ambiguity before implementation while keeping scope realis
|
|||||||
|
|
||||||
## Persistence
|
## Persistence
|
||||||
|
|
||||||
**Q27 [P0/P1]** What is the database?
|
**Q27 [P0/P1]** ~~What is the database?~~ **RESOLVED — see Resolved Decisions.**
|
||||||
|
|
||||||
- PostgreSQL is the obvious choice given full-text search, JSONB for flexible AP object storage, and broad hosting support — is this confirmed?
|
Decision: PostgreSQL as primary target with a repository abstraction layer. SQLite is a possible future option (hobbyist self-hosting) enabled by the abstraction without changing business logic. Decision track: one ADR to cover the persistence layer architecture (database choice + repository pattern + query library selection).
|
||||||
- Is SQLite a supported option for lightweight self-hosting, or is that complexity not worth it?
|
|
||||||
|
|
||||||
**Q28 [P1]** What is the database migration strategy?
|
**Q28 [P1]** What is the database migration strategy?
|
||||||
|
|
||||||
@@ -336,10 +358,9 @@ The goal is to remove ambiguity before implementation while keeping scope realis
|
|||||||
|
|
||||||
## Content and Community Policy
|
## Content and Community Policy
|
||||||
|
|
||||||
**Q36 [P0]** Does the platform define baseline content guidelines beyond what moderation tooling enforces?
|
**Q36 [P0]** ~~Does the platform define baseline content guidelines beyond what moderation tooling enforces?~~ **RESOLVED — see Resolved Decisions.**
|
||||||
|
|
||||||
- What categories of content are prohibited regardless of instance policy?
|
Decision: CSAM, doxxing, and NCII are hardcoded prohibitions enforced in code as far as technically feasible. The guiding principle is consent. Weapons and dual-use DIY content are not hardcoded — handled by operator policy.
|
||||||
- CSAM must be prohibited and reported — is there a mechanism planned?
|
|
||||||
|
|
||||||
**Q37 [P4]** Are collaborative/co-authorship workflows in scope?
|
**Q37 [P4]** Are collaborative/co-authorship workflows in scope?
|
||||||
|
|
||||||
|
|||||||
+4
-1
@@ -23,6 +23,7 @@ Exit criteria:
|
|||||||
- CI skeleton in place for formatting, linting, and tests.
|
- CI skeleton in place for formatting, linting, and tests.
|
||||||
- ADR for project revision lifecycle (draft/publish/supersede).
|
- ADR for project revision lifecycle (draft/publish/supersede).
|
||||||
- ADR for composable extension mechanism (shape, namespacing, discovery).
|
- ADR for composable extension mechanism (shape, namespacing, discovery).
|
||||||
|
- ADR for persistence layer architecture (PostgreSQL as primary target; repository abstraction pattern; query library selection; SQLite future-option strategy).
|
||||||
- Initial repository layout includes dedicated locations for API contracts and extension schemas.
|
- Initial repository layout includes dedicated locations for API contracts and extension schemas.
|
||||||
- Documented answer to Q38: personal data categories and lawful basis for each (prerequisite for any user data model work).
|
- Documented answer to Q38: personal data categories and lawful basis for each (prerequisite for any user data model work).
|
||||||
- Draft privacy notice template and operator guidance (prerequisite for any public-facing instance).
|
- Draft privacy notice template and operator guidance (prerequisite for any public-facing instance).
|
||||||
@@ -36,6 +37,7 @@ Goals:
|
|||||||
- Basic publishing lifecycle (draft, published, updated).
|
- Basic publishing lifecycle (draft, published, updated).
|
||||||
- Search and browse within one instance.
|
- Search and browse within one instance.
|
||||||
- Support extension payloads in project data model without requiring domain-specific first-party implementations.
|
- Support extension payloads in project data model without requiring domain-specific first-party implementations.
|
||||||
|
- Support extension payloads on material entries using the same extension mechanism, enabling domain-specific material attributes (yarn weight, filament profile, component spec) without changing the core material record.
|
||||||
- User-level personal moderation: block and mute individual accounts; keyword and wildcard content filters. No moderator approval required.
|
- User-level personal moderation: block and mute individual accounts; keyword and wildcard content filters. No moderator approval required.
|
||||||
- Personal collections and bookmarks (private by default).
|
- Personal collections and bookmarks (private by default).
|
||||||
- Self-service data export (right to access) and account deletion (right to erasure) with defined retention and purge windows.
|
- Self-service data export (right to access) and account deletion (right to erasure) with defined retention and purge windows.
|
||||||
@@ -45,7 +47,8 @@ Exit criteria:
|
|||||||
- A user can publish and update a complete project end-to-end.
|
- A user can publish and update a complete project end-to-end.
|
||||||
- Project pages are discoverable and readable on one node.
|
- Project pages are discoverable and readable on one node.
|
||||||
- Test coverage exists for core behavior paths.
|
- Test coverage exists for core behavior paths.
|
||||||
- At least one extension payload can be stored, validated, and rendered as non-breaking optional data.
|
- At least one project-level extension payload can be stored, validated, and rendered as non-breaking optional data.
|
||||||
|
- At least one material-level extension payload can be stored, validated, and rendered as non-breaking optional data.
|
||||||
- A user can block another user and have that block take effect immediately without moderator involvement.
|
- A user can block another user and have that block take effect immediately without moderator involvement.
|
||||||
- A user can bookmark and organize projects into personal collections.
|
- A user can bookmark and organize projects into personal collections.
|
||||||
- A user can export all their personal data without admin involvement.
|
- A user can export all their personal data without admin involvement.
|
||||||
|
|||||||