I deleted Rewind over the weekend, after I noticed it had eaten 20 GB of storage.
I hadn’t used it since installing it. So that came as a surprise.
I then tried using it, and couldn’t get it to find things I knew were in my history. (Basic keyword matches.)
So I deleted it.
I like the idea of this app. It ticks all the boxes. But I haven’t found any value in this category of app yet.
Let me share a scenario I personally encountered. One day, I came across an introduction to a Star Wars animation on a video recommendation website. I was drawn to the fancy cover image, which briefly flashed by in the homepage slides. However, I got busy with other things and only had time to search for it later that evening. Without this project, that information would have been almost impossible to find again because, as part of the website’s recommendation system, the content changes every time you refresh the page. But this time was different: I was able to retrieve the content by searching for the keyword “Star War” and narrowing the search by time range.
Of course, I know that such a feature might seem trivial. Some things are simply forgotten, and that’s fine. But what if it is a more important clue, like a website bug that only triggers under narrow conditions and is hard to reproduce?
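To make the "keyword plus time range" lookup concrete, here is a minimal sketch of how such a search could work over stored OCR text. It is not taken from the project's code: the `captures` table, its columns, and the function names are all hypothetical, and a plain SQLite `LIKE` match stands in for whatever index the project actually builds.

```python
import sqlite3
from datetime import datetime


def build_index(conn: sqlite3.Connection) -> None:
    """Create a hypothetical table holding one row per screen capture."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS captures ("
        " id INTEGER PRIMARY KEY,"
        " captured_at TEXT NOT NULL,"  # ISO 8601 text, which sorts chronologically
        " ocr_text TEXT NOT NULL)"
    )


def add_capture(conn: sqlite3.Connection, captured_at: datetime, ocr_text: str) -> None:
    """Store the text recognized in one capture together with its timestamp."""
    conn.execute(
        "INSERT INTO captures (captured_at, ocr_text) VALUES (?, ?)",
        (captured_at.isoformat(timespec="seconds"), ocr_text),
    )


def search(conn: sqlite3.Connection, keyword: str, start: datetime, end: datetime):
    """Return (captured_at, ocr_text) rows containing keyword within [start, end]."""
    # LIKE does a substring match (case-insensitive for ASCII in SQLite).
    # A keyword containing % or _ would be treated as a wildcard in this sketch.
    rows = conn.execute(
        "SELECT captured_at, ocr_text FROM captures"
        " WHERE ocr_text LIKE ? AND captured_at BETWEEN ? AND ?"
        " ORDER BY captured_at",
        (
            f"%{keyword}%",
            start.isoformat(timespec="seconds"),
            end.isoformat(timespec="seconds"),
        ),
    )
    return rows.fetchall()
```

Because the match is on a substring, a partial keyword like "Star War" still finds a capture whose text says "Star Wars", and the time range keeps older, unrelated captures out of the results.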
Memos is a privacy-focused passive recording project. It can automatically record screen content, build intelligent indices, and provide a convenient web interface to retrieve historical records.
This project draws heavily from two other projects: one called Rewind and another called Windows Recall. However, unlike both of them, Memos allows you to have complete control over your data, avoiding the transfer of data to untrusted data centers.
> avoiding the transfer of data to untrusted data centers
In short order, this will create a large corpus of unsecured local data.
Is the user expected to secure the data independently?
Do Recall/Rewind help the user to filter recorded data for retention or deletion?
"large corpus of unsecured local data": is this much worse than an unencrypted Outlook mailbox (PST or OST)? Or offline files from your Dropbox/GDrive/etc.? Or your browser profile?
I guess it's worse in the sense that it also records audio, but a large corpus of information is already at risk on an unsecured or compromised device.
Rewind and Recall also store similar data locally, but maybe not only locally. And Recall/Rewind allow data deletion: they can retain the most recent data based on time.
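For illustration, time-based retention of this kind can be sketched in a few lines. This is an assumption about how such a policy might look, not a description of how Rewind, Recall, or Pensieve implement it; the `captures` table and both function names are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta


def create_store(conn: sqlite3.Connection) -> None:
    """Create a hypothetical table with one timestamped row per capture."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS captures ("
        " id INTEGER PRIMARY KEY,"
        " captured_at TEXT NOT NULL,"  # ISO 8601 text, which sorts chronologically
        " ocr_text TEXT NOT NULL)"
    )


def prune_old_captures(conn: sqlite3.Connection, now: datetime, keep_days: int) -> int:
    """Delete captures older than keep_days and return how many rows were removed."""
    cutoff = (now - timedelta(days=keep_days)).isoformat(timespec="seconds")
    cur = conn.execute("DELETE FROM captures WHERE captured_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

A real tool would also have to remove the screenshot or audio files that the deleted rows point to; this sketch only covers the index.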
Rewind and Recall are 2 separate projects and 2 separate installers. I use Rewind and I have several outbound network monitoring apps as well as local disk monitoring apps. Rewind does not send data offsite.
Rewind does glitch sometimes, specifically with audio recording, which is extremely annoying. You go back to an area where you thought you had audio notes only to find you didn’t - even though you had audio recording turned on the whole time. It has something to do with meeting detection, which is silly because disk space is cheap: just auto-record. I do like the concept of an open source version and I will look into this.
Thanks to the PR debacle, Recall now encrypts the data in a VM: https://www.windowscentral.com/software-apps/windows-11/wind...
If this is very important, I suppose I will implement encryption for stored data in future versions.
However, I still have a question about this: it seems that many hard disks are already encrypted. After all, I also store a large number of personal photos, documents, bills, and other important information on my computer, and I haven’t meticulously encrypted all this data again. Should I be doing that?
It’s a question of risk.
Full disk encryption targets a different threat model - disk encryption protects against someone grabbing your computer.
Writing into an encrypted blob on disk adds a layer of protection against bad actors exfiltrating data by running code on the laptop.
Overall I really am amazed that this sort of thing is now possible and appreciate a privacy-aware / local compute and storage version of it!
Strongly recommend you check out the built-in Swift APIs for screen capture and OCR. They’re heavily optimized for energy usage, and allow much finer-grained control over which apps are white/blacklisted for privacy.
Thanks for the advice, I will look into this part more. Currently I am using a package named "ocrmac", and it helps a lot.
Ah, I see the commit that renamed the repo[1], because the title says "Memos" and the URL says "/memos" but the repo was different. I similarly got confused while reading the readme, thinking Pensieve was a dependency or something.
1: https://github.com/arkohut/pensieve/commit/e81057d5bebcf9cab...
Sorry for the confusion. I gave the project a bad name, "memos", but there is already a great open source project named "memos". So I quickly changed the name to "Pensieve".
How’s the performance with Python? What’s the overhead?
Have you used much python, or are you just buying into the "python slow" memes?
Unless they've done something very very wrong performance will be fine. This isn't doing anything where python's overhead would matter.
It's gluing together some highly optimized code written in other languages, or using Python as a DSL to interface with highly optimized libraries like NumPy, or generating highly optimized assembly with something like JAX, or, if they're really fancy, compiling a restricted subset directly to GPU shaders or something.
Python is plenty fast for most stuff, and when it isn't it has one of the best pathways towards optimization.
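A small, standard-library-only sketch of the "glue" point: the same sum computed once with an explicit Python loop and once by handing the whole job to the built-in `sum`, which in CPython is implemented in C. The function names are made up for illustration, and the timings will vary by machine, so none are quoted here.

```python
import timeit


def total_pure_python(n: int) -> int:
    """Add up 0..n-1 one step at a time in interpreted Python code."""
    total = 0
    for i in range(n):
        total += i
    return total


def total_builtin(n: int) -> int:
    """Add up 0..n-1 with a single call into the built-in sum."""
    return sum(range(n))


if __name__ == "__main__":
    n = 1_000_000
    loop_s = timeit.timeit(lambda: total_pure_python(n), number=5)
    builtin_s = timeit.timeit(lambda: total_builtin(n), number=5)
    print(f"explicit loop: {loop_s:.3f}s, built-in sum: {builtin_s:.3f}s")
```

The same pattern applies at a larger scale when the heavy lifting (OCR, embedding, image encoding) happens inside a native library and Python only makes the call.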
Great work, OP. As others have said, encryption is vital to such a project. In fact, if your ethos is privacy, it would be great marketing material to assure users that this is in fact resistant to basic infiltration. I think Recall is a fantastic idea, even for professionals and corporate environments. But the kind of sensitive information that is handled by employees cannot risk being leaked from such a tool.
Thanks for the advice. I will work on this feature.