[GSoC 2023] UI tool for fetching online content

a.paritosh · May 5, 2023, 5:53am

Brief summary of Project
The project aims to design a FreeCAD model library management tool that can fetch content online from different repositories, without the need to download the whole library. The library management tool to be developed should be able to :-

browse the existing dataset of all the models
download individual models
manage user models
insert components into FreeCAD document
structured online repository storage system
local storage structure

Project Description (Detailed)

Introduction
A library management add-on for online and offline FreeCAD components is a utility for FreeCAD that allows easy, structured management of components/models available online or offline, and lets them be inserted easily into a FreeCAD document.
Its basic function is to let individual components from the online repository be browsed, downloaded, and extended with already available local components.
A library management system needs a robust data repository and a local storage structure. This is important because users need to browse through all the existing models in the repository and download only selected models. A structured local storage will allow users to easily add their own models not only to the local library management system, but also to the online repository, which is open to the public. Maintaining the metadata of these models requires a well-defined storage structure.
All of this needs a proper interface to be operable. The interface will be developed as both a graphical interface and a module for Python scripting.

Information and current state
Currently, the online repository is hosted on GitHub and can be downloaded through a FreeCAD add-on, which downloads the whole repository. The repository is very large (1.5 GB+).

Functional Requirements

Browsable online repository
All the components present in the open-source online repository will be made browsable to the user. The components will be loaded in chunks as the user scrolls through the models. The models can be previewed in a grid view and its GUI is discussed in further sections.
The following functionalities will be developed to make the library of parts easy to browse :-

Sorting
Filtering
Searching
The information required for thumbnails, sorting, filtering, etc. will be fetched in chunks from the metadata of the components in the repository.
Loading data in chunks will make the user experience smooth as fetching time will be less.
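
The chunked loading described above can be sketched roughly as follows. This is only an illustration: the helper name `iter_chunks`, the chunk size, and the in-memory `index` list standing in for the remote metadata are all assumptions, not part of the proposal.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def iter_chunks(records: Iterable[Dict], chunk_size: int) -> Iterator[List[Dict]]:
    """Yield successive lists of at most chunk_size metadata records."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Hypothetical metadata records standing in for the remote index.
index = [{"name": f"part-{i}"} for i in range(7)]

# The UI would request the next chunk each time the user scrolls further.
pages = list(iter_chunks(index, 3))
```

Because the function is a generator, a grid view could pull one chunk at a time with `next()` instead of materialising every page up front.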

Individual Downloadable components
The components being browsed will be downloadable individually. Each component can be reviewed through its thumbnail and metadata before being downloaded.
Once downloaded, the component will be stored along with the other components in the Local Storage Structure which is discussed in further sections.

Adding user components to the library management system
All the downloaded components will be visible in the My Models section of the GUI, which is discussed in further sections. The user will also be able to add their own components to the management system.
This will be made possible by adding the user components to the Local Storage Structure along with the metadata for a unified and structured storage and management of components.
This will make it easy for the user to maintain their own local repository of the components which can be accessed, browsed and used from a single unified interface.

Metadata management
Metadata is data about data (in our case, components). It helps users understand the underlying data and provides information about the components. This supporting data needs to be independent of the components themselves, and it helps users and the system by providing mandatory additional information.
The metadata of the Components Library Management system is important for browsing the large number of components in the repository, because the component itself does not need to be fetched in order to browse it before download. For this reason the metadata must be available and downloadable independently of the component itself. This will be done by maintaining a separate database of all the components, each with a unique component ID. This metadata will facilitate searching, sorting and filtering.
List of Metadata :-
Required:

name
version
maintainer (email)
license (file)
URL (component)
created on
modified on
type (type of file)
size (component size)
Optional:
author (can be multiple, emails)
thumbnail
description
rating (out of 5)
tags (can be multiple)
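
As a rough sketch, the required and optional fields listed above could be modelled as a Python dataclass. The class name, the field types, the date format, the size unit, and the example values are all assumptions made for illustration; the proposal does not specify them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ComponentMetadata:
    # Required fields, in the order listed in the proposal.
    name: str
    version: str
    maintainer: str   # email address
    license: str      # path to the license file
    url: str          # where the component can be downloaded
    created_on: str   # ISO 8601 date string (assumed format)
    modified_on: str  # ISO 8601 date string (assumed format)
    type: str         # type of file
    size: int         # component size in bytes (assumed unit)
    # Optional fields.
    authors: List[str] = field(default_factory=list)  # emails
    thumbnail: Optional[str] = None
    description: str = ""
    rating: Optional[float] = None  # out of 5
    tags: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject ratings outside the 0-5 range stated in the list.
        if self.rating is not None and not 0 <= self.rating <= 5:
            raise ValueError("rating must be between 0 and 5")


# Hypothetical example record; every value here is made up.
example = ComponentMetadata(
    name="M6 hex bolt",
    version="1.0",
    maintainer="maintainer@example.com",
    license="LICENSE",
    url="https://example.com/parts/m6-bolt.FCStd",
    created_on="2023-05-05",
    modified_on="2023-05-05",
    type="FCStd",
    size=20480,
    tags=["fastener", "metric"],
)
```

Leaving the optional fields with defaults means a record can be created from the required fields alone, which matches the required/optional split in the list.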

Insertion of available local/downloaded components to the document
Components that are already available locally can be added through the interface. The user needs to specify the component path and add some mandatory metadata before the component can be made available through the Library Management Tool. The addition of this new component will follow the structure of the Local Storage.

Uploading of user models to online repository
The user can submit their own models to the online repository and contribute to open source. The online repository of components and the Local Storage Structure, which is discussed below, will be compatible.

Interface

Graphical User Interface
Below is the structure and parts of the GUI of Library Management tool.

Wireframe :-

Python Module
A Python module will be developed to facilitate scripting of the various parts of the library management system. It will include all the functionalities listed in the functional requirements, as well as insertion of components from scripts.
Below is the design of the modules and sub-modules.

Database of open-source models (online Repository)
A new structured way of storing components is proposed, which will separate the maintenance of components from the management of the online repository. The basic schema of the proposed solution is shown below.

_Complete schema of metadata and tags is discussed below_
In this structure, maintenance of components can be done independently from the management of the repository. This is possible because the components can come from different locations or repositories, but their data can be maintained and tracked from the management section.
Every component will have a URL which can be used to track it and as the source to download it. All the additional data and metadata can be stored and maintained in a database with the above specified schema.

Local Storage Structure
The local storage structure needs to be compatible with the online repository and maintainable alongside it. Its structure is similar to that of the online database of components, so that components can be easily downloaded and uploaded without any conflicts.
Database schema of the local storage structure is shown below.

_Complete schema of metadata and tags is discussed below_
The storage of components on the filesystem of the user's local machine will be as shown below.

Metadata Storage Structure

Tools and Technology
• Python
• PyQt/PySide (as required)
• Qt Designer
• Qt Creator
• git (for maintenance)
• SQLite

Documentation
All the Python modules will be well documented, following PEP 8 – Style Guide for Python Code.
Additionally, markdown files will supplement the user guidelines and additional data or documentation required.

Working Schedule
Till May 28 (Analysis and Design Period)
• Requirement Analysis
◦ Going through the code base
◦ Coding norms and standards
◦ Discussion with mentor
◦ exploring FreeCAD software
• Creating Software Requirement Specifications Document
◦ introduction
◦ Overall Description
◦ Interface requirements
◦ System Features(Functional requirements)
◦ Non functional requirement
• Creating Design document
◦ Creating modules structure
◦ Class Object Diagrams
◦ Use-case diagrams
◦ Data-flow Diagrams
• Discussion of Final Development Planning, strategies and structures with mentor
May 28 to July 14 (Development Phase) – 7 weeks
May 28 – June 10 (2 weeks)
• Designing of online repository structure
• Designing of local storage structure
• Synchronising/porting existing data to the new system
• Testing
June 11 – July 1 (3 weeks)
• Designing of modules
◦ making a module to insert component in the FreeCAD document
◦ designing processes for specified functional requirements
• Designing of API
• Running unit tests
• Running integration tests
July 2 – July 14 (2 weeks)
• Designing GUI interfaces
◦ making UI files
◦ gathering UI resources
◦ testing the UX and integrity/responsiveness
• Implementing and integrating python modules with the GUI
◦ making threads for different concurrent services
◦ making a pagination/ chunk loading of components
◦ interfacing the modules with their respective UI components

July 14 to August 21 (Clean-up and wrap-up)
• Code Clean-up
• Write a blog post for public visibility
• System Testing
• Integration Testing
• Documentation
◦ revising the module documentation-strings
◦ making the user manual
◦ making a documentation of the library from module doc-string and additional data
• Packaging the application
• Final Code submission

Time Availability
• I can devote 40-50 hours per week
• I will spend more time if needed, and will also carry out extensive experiments

Why FreeCAD?
I am motivated to contribute to open source as I really admire the open-source community and try to use as much open-source software as I can. I have also used FreeCAD before while just tinkering around and found it the best open-source alternative to the proprietary software out there. I am eager to develop solutions and add value to the community and my resume. By participating in GSoC I intend to strengthen my technical skills, but more than anything else I want to make the Linux operating system more suitable for day-to-day use and want more and more people to use it. Through FreeCAD, I want professionals in the relevant fields to give Linux a try and be a part of the open-source community.

jimb · May 7, 2023, 2:20pm

List of Metadata :-
Required:
…
4. license (file)

Suggest adding SPDX license identifier. That could make any searches by license easier/possible. Also easier for people to quickly identify the license, as they have short well-defined names like “LGPL-3.0-or-later”. More info in SPDX FAQ.

Some things may be available (licensed) under multiple licenses.

yorik · May 8, 2023, 8:43am

Welcome on board Amulya!

The plan is great!

Maybe you could also think of how we could deal with the current Parts library at https://github.com/FreeCAD/FreeCAD-library
A 1.5 GB Git repo is too large, git was not really designed for that, and also being able to version control the files is not really super important here, as they almost never change. Would there be a better solution? FreeCAD has money now. We could host this somewhere else, buy storage space, etc. But would that be a good solution? Wouldn’t it be better to split the library? Maybe you can help us discuss these things.

In any case, it is likely that, even if we move out of github at some point, in the future people would create other libraries with Git, so supporting Git platforms is certainly necessary.

a.paritosh · May 8, 2023, 11:13am

Yes, supporting Git platforms is a must, but I think a flexible solution that can support not just one but multiple repositories would make more sense. I have an idea for this and have discussed it in my proposal (above).

Basically, rather than hosting the whole components repository ourselves, we can just store the component URLs with their metadata in a database. This will even allow users to add custom components through their personal (public) repository.

But I am still a bit uncertain about this approach! Suggestions are welcomed.

wandererfan · May 8, 2023, 2:07pm

LibreOffice has “Open Remote” function that allows reading from network shares, document management system urls, etc. There might be some inspiration there.

chennes · May 8, 2023, 4:15pm

If you have not already done so, I suggest reviewing this discussion, to get a sense for what people are saying they want. https://devtalk.freecad.org/t/rethinking-the-part-library/55581/1

a.paritosh · May 12, 2023, 6:18am

What are your opinions on using a tag system for grouping components rather than a tree structure as in the current GitHub repo?

There are several advantages of using Tags system over Tree based Hierarchical system like :-

Tags provide a more flexible way of categorizing and grouping components. Unlike a tree structure, where each component can only exist in one specific category, tags allow components to have multiple tags associated with them.
Components can be associated with tags based on various characteristics, properties or relationships
This allows for more fluid and intuitive grouping based on different criteria.
With a tag system, it is easier to adapt and modify the grouping of components as requirements change over time.
This can allow a finer and more dynamic hierarchy.
Adding or removing tags to components is a more straightforward process compared to restructuring a tree hierarchy.
Flexibility is beneficial when dealing with a large number of components and when the classification criteria are subject to frequent changes.
Tags can enhance search, retrieval and filtering capabilities by allowing components to be associated with multiple tags.

Zolko · May 12, 2023, 11:26am

I think that’s a very good idea, but the danger is the use of different yet similar tags for the same subjects: Bearing –vs– bearing, CHC –vs– ISO 4762, extruded profile –vs– beam … So you must think of a way that part designers choose by default existing tags, even though they must be able to create new tags for really new parts

a.paritosh · May 12, 2023, 3:07pm

OK… From what I understood, there will be two kinds of similarities,

Spelling Based Similarities

Like in the example you gave of Bearing & bearing, these two tags differ only on the basis of case.
Solution : This can easily be countered by using just the lower case for the tags.
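
A minimal sketch of that normalisation, assuming tags are plain strings. The function name `normalize_tag` is hypothetical, and collapsing extra whitespace goes slightly beyond the lower-casing described above; it is an added assumption.

```python
def normalize_tag(tag: str) -> str:
    """Lower-case a tag and collapse leading, trailing and repeated whitespace."""
    return " ".join(tag.lower().split())


# Tags that differ only by case (or stray spaces) collapse to one entry.
raw_tags = ["Bearing", "bearing", "  Extruded   Profile "]
unique_tags = {normalize_tag(t) for t in raw_tags}
```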

Logical Similarities

In the other example, extruded profile & beam, two or more such tags are logically the same and refer to the same item/type.
Solution : There can be two solutions to this :-

There can be some predefined tags but in case of the unavailability of the required tag , the user could request the required tag. For this, there needs to be an approver who will approve the requested tags
CASE 1(If similar tag is available): The approver will replace the required tag with the available similar tags
CASE 2(If no similar tag is available): The approver will add a new tag for the same

Pros

This will ensure the quality and maintainability
More reliability

Cons

This requires a dedicated person who will do this job

An algorithm that can group tags.
Users can add whatever tags they want. The UI will suggest the tags based on

what they are typing
top n tags suggestion

top n suggestion can be made by filtering the top most related tags from the already specified tags.

Algorithm :
all tags will have a weight with every other tag
weight of tag i with respect to a given tag = (no. of times tag i is used with that tag) / (total no. of tags associated with that tag)
these weights will be updated periodically
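
The weighting above could be sketched like this. One assumption needs stating loudly: "total no. of tags associated with the tag" is read here as the total number of co-occurrences, so the weights for each tag sum to 1. The function names and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations
from typing import Dict, Iterable, List


def cooccurrence_weights(components: Iterable[Iterable[str]]) -> Dict[str, Dict[str, float]]:
    """For each tag, weight every other tag by how often the two appear together."""
    counts = defaultdict(Counter)
    for tags in components:
        # Count each ordered pair of distinct tags on the same component.
        for tag, other in permutations(set(tags), 2):
            counts[tag][other] += 1
    weights = {}
    for tag, counter in counts.items():
        total = sum(counter.values())  # assumed denominator: all co-occurrences
        weights[tag] = {other: n / total for other, n in counter.items()}
    return weights


def suggest(weights: Dict[str, Dict[str, float]], tag: str, n: int = 3) -> List[str]:
    """Return the top n related tags, highest weight first, ties broken by name."""
    related = weights.get(tag, {})
    return sorted(related, key=lambda other: (-related[other], other))[:n]


# Hypothetical tag sets, one list per component.
tagged = [
    ["bearing", "metric", "steel"],
    ["bearing", "metric"],
    ["bolt", "metric", "steel"],
]
weights = cooccurrence_weights(tagged)
top_for_bearing = suggest(weights, "bearing", 1)
```

Recomputing `weights` from the full tag data on a schedule would correspond to the periodic update mentioned above.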

Pros

the system will be self learning
requires little to no maintenance in long run
will follow the trend

Cons

system will take time to learn
might require maintenance in starting

But should this be the scope of this project right now or can it be developed in next iteration?

chennes · May 12, 2023, 3:12pm

This came up a bit at the Brussels meeting this year with regard to Addons, and as Zolko notes, the big challenge isn’t the technology, it’s the humans. You are correct to note that this is itself actually quite a big project. I think we can view tag management as its own project that you are welcome to work on if it interests you, or that can be left to a later developer. In which case I propose a closed tag system based on a flat text file hosted on GitHub for the time being. Adding tags to that file adds them to the interface.
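
A closed tag list kept in a flat text file might be read along these lines. The file layout (one tag per line, `#` for comments) and both function names are assumptions for illustration; no format has been agreed in this thread.

```python
from typing import Iterable, List, Set, Tuple


def parse_tag_file(text: str) -> Set[str]:
    """Read one tag per line, skipping blank lines and '#' comment lines."""
    tags = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            tags.add(line.lower())
    return tags


def split_known_unknown(candidates: Iterable[str], allowed: Set[str]) -> Tuple[List[str], List[str]]:
    """Separate tags found in the closed list from tags that are not in it."""
    known, unknown = [], []
    for tag in candidates:
        (known if tag.lower() in allowed else unknown).append(tag)
    return known, unknown


# Hypothetical contents of the hosted tag file.
TAG_FILE = """# approved tags
bearing
fastener

metric
"""
allowed = parse_tag_file(TAG_FILE)
known, unknown = split_known_unknown(["Bearing", "beam"], allowed)
```

With this shape, adding a line to the file is all it takes for a new tag to appear in `allowed`.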

a.paritosh · May 12, 2023, 4:22pm

As in the proposal, I am planning to implement tags in the database too. What are your suggestions on using a database to store both the metadata and its related tags?

The ER model for the same is given in the proposal.

chennes · May 12, 2023, 10:43pm

I think you can only put tags in a database if you are prepared to implement a tag management UI. Otherwise it is too difficult for someone to propose new tags, and you basically have to use an open system, which will be a mess.

I think you need to really consider whether a SQL database is the right direction at all. It has a number of major downsides, including requiring another authentication setup, and making it much more difficult for an individual or company to host a private repository of parts. I am very interested in others’ feedback on this point.

a.paritosh · May 13, 2023, 4:15am

The system is designed for flexibility so that it is easy for any individual or company who has an online repository of parts (maintained by themselves). They just need to specify the URL to the component and fill the required metadata fields.
I think I might be missing the point of how this will make it difficult for an individual or company to add new parts to the system.

The database will be exposed through an API framework (like Flask, FastAPI, etc) so the authentication can be handled in that layer.

yorik · May 17, 2023, 8:39am

Like in most open-source projects, we like to keep things as simple as possible in FreeCAD… We must always think about other developers working on top of our work, and possibly too non-developers, that is, just FreeCAD users with little or no programming knowledge annoyed by something and deciding to try to fix it themselves
If you are designing some form of database, I’d go for something VERY simple like a text file or a csv file. Something the FreeCAD UI or any other UI or script can easily fetch and parse, with no extra dependency or hassle.
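
As an illustration of how little is needed to parse such a file, here is a sketch using only Python's standard `csv` module. The column names, the semicolon-separated tags column, and the sample rows are assumptions, not an agreed format.

```python
import csv
import io
from typing import Dict, List

# Hypothetical index file contents; every row here is made up.
CSV_TEXT = """name,url,license,tags
M6 hex bolt,https://example.com/parts/m6-bolt.FCStd,CC0-1.0,fastener;metric
608 bearing,https://example.com/parts/608.FCStd,CC-BY-4.0,bearing
"""


def load_index(text: str) -> List[Dict]:
    """Parse the CSV index and split the tags column into a list."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["tags"] = [t for t in row["tags"].split(";") if t]
        rows.append(row)
    return rows


index = load_index(CSV_TEXT)
# Filtering becomes a plain list comprehension over the parsed rows.
metric_parts = [row["name"] for row in index if "metric" in row["tags"]]
```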

a.paritosh · May 18, 2023, 11:13am

Ok I get it. I think using csv files will be much better, as it will preserve the original idea of table-based storage and will also be easy to understand and parse. And what about using an ORM on top of it?
From what I am thinking,

using ORM we can achieve a certain level of abstraction
make code more pythonic
relationship management can be easier
It might also increase the readability of the code

yorik · May 19, 2023, 8:54am

Could be, but keep in mind:

Even with a “database” structure to store some metadata such as tags, most of the info you’ll need to retrieve will still be stored into files. For example, the size, the number of objects, the type of model, the thumbnail of a FreeCAD model, all this you’ll retrieve by reading files, not accessing a database
We must expect people simply throwing models into the library, like they do now. They won’t fill forms, they won’t format things properly, etc. This is something good, it’s like a wiki, the low entry barrier permits that people contribute a lot. But we must not rely too much on very clean/predictable data structure
Also keep in mind to keep new dependencies low or even better, not add any new dependencies. When designing a new system that will be built in FreeCAD, this is a crucial point. Each new piece of software FreeCAD will depend on needs to be analysed very carefully in terms of license compatibility and general multiplatform availability, and as much as possible avoided, because it gives extra headache for maintainers and packagers.

Basically I would try not to design something too complex for now, as it could become counterproductive later on.

chennes · May 19, 2023, 3:55pm

I’m not sure the dependency problem is as acute as yorik is concerned about: the database part of this project is server-side, so doesn’t need to be installed by end users. My understanding of the proposal is that even if a.paritosh decides to use a database backend, the access will be via a REST API, so the frontend in FreeCAD doesn’t need to know anything about it.

My whole concern about the database boils down to two things:

First, long-term maintainability of the server itself. We will need to keep the database software up-to-date, etc., which imposes extra work (mostly on kkremitzki probably, though I should be able to help).

Second, if a company wants to have a fully private component library, they would need to build their own database server, which is a big ask compared to asking your IT team for some company-wide file storage space.

berniev · May 27, 2023, 3:05am

The objections to database seem to be based on fear of traditional server hosted monoliths run by people in white coats.

Sqlite is a modern alternative. It is single file. It is ubiquitous. Yet can be huge.

For storing and retrieving data, nothing beats a database. Sure beats mucking about with XML file(s). Edit: Or csv files..

berniev · May 27, 2023, 4:33am

And Sqlite is directly supported by Python out of the box. (Sqlite3)
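
For illustration, a single-file (here in-memory) database holding component URLs and tags could look like the sketch below, using only the standard `sqlite3` module. The table layout, the function names, and the sample rows are hypothetical and are not the schema from the proposal.

```python
import sqlite3

# ":memory:" keeps the example self-contained; a real library would use a file path.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE component (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    url  TEXT NOT NULL UNIQUE
);
CREATE TABLE tag (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE component_tag (
    component_id INTEGER NOT NULL REFERENCES component(id),
    tag_id       INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (component_id, tag_id)
);
""")


def add_component(conn, name, url, tags):
    """Insert a component and link it to its tags, creating tags as needed."""
    cur = conn.execute("INSERT INTO component (name, url) VALUES (?, ?)", (name, url))
    component_id = cur.lastrowid
    for tag in tags:
        conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (tag,))
        tag_id = conn.execute("SELECT id FROM tag WHERE name = ?", (tag,)).fetchone()[0]
        conn.execute("INSERT INTO component_tag VALUES (?, ?)", (component_id, tag_id))
    conn.commit()
    return component_id


def find_by_tag(conn, tag):
    """Return the names of all components carrying the given tag, sorted by name."""
    rows = conn.execute(
        """SELECT c.name FROM component c
           JOIN component_tag ct ON ct.component_id = c.id
           JOIN tag t ON t.id = ct.tag_id
           WHERE t.name = ? ORDER BY c.name""",
        (tag,),
    )
    return [row[0] for row in rows]


# Hypothetical sample data.
add_component(conn, "M6 hex bolt", "https://example.com/parts/m6-bolt.FCStd", ["fastener", "metric"])
add_component(conn, "608 bearing", "https://example.com/parts/608.FCStd", ["bearing", "metric"])
metric_names = find_by_tag(conn, "metric")
```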

yorik · May 27, 2023, 3:18pm

okay i’m convinced