I can't use it because I'm not classified as "human" by a computer. There is no captcha that I could get wrong, just a checkbox that probably uses a black box model to classify me automatically
Was curious after the post claimed that the quality is better than Google and DeepL, but the current top comment showed translations from Afrikaans that it got wrong but I could understand as a Dutch person who doesn't even speak that language (so it's not like it broke on seven levels of negation and colloquialisms)
What do I do with this "Error Code: 600010"? I've submitted a "report" but obviously they're not going to know if those reports are from a bot author frustrated with the form or me, a paying customer of Kagi's search engine. The feedback page linked in the blog post has the same issue: requires you to log in before being able to leave feedback, but "We couldn't verify if you're a robot or not." The web is becoming more fragmented and unusable every day...
> What do I do with this "Error Code: 600010"?
Cloudflare, the gatekeeper of the internet, strikes again.
The usual suspects are VPN or proxy, JavaScript, cookies, etc.
https://developers.cloudflare.com/turnstile/troubleshooting/...
Unfortunately, even with the error code, I doubt the above page will help much.
Cloudflare is more or less a necessity if you offer any sort of computationally expensive service for free. They're problematic for sure, but I think they're a lesser evil in the grand scheme of things.
Very much a symptom of a much larger problem however, one with not a lot of good solutions.
I had tons of issues with these Cloudflare checkboxes. I finally figured out it was because I use this extension [1] that disables HTML5 autoplay. I assume Cloudflare verifies that the client can play back media, on the assumption that headless browsers or crawlers won't have that capability.
[1] https://addons.mozilla.org/en-US/firefox/addon/disable-autop...
Which is hilarious, because I bypass all Cloudflare Turnstiles with 20 lines of Python and the 2captcha API.
It only defeats users.
These mainly exist to push responsibility to somebody else. Proper functionality is secondary.
Interactive captchas have one foot in the grave. With multi-modal, tool-using AI models proliferating, challenge tasks that only a human can complete are vanishing. Even now, current challenges exclude users with minor physical or mental impairments.
Anti-bot filters will require a different signal to determine if a physical human actually made the request.
> Afrikaans that it got wrong but I could understand as a Dutch person who doesn't even speak that language
The one language is basically a derivative of the other. Understanding and judging accuracy of translation are quite different though.
E.g. for me it's the reverse: I can understand large parts of Dutch due to Afrikaans, but couldn't tell you if a Dutch sentence is correctly translated or grammatically correct.
> I can't use it because I'm not classified as "human" by a computer.
It uses Cloudflare Turnstile captcha.
The service shows no captcha to logged in Kagi users, so you can just create a (trial) Kagi account.
Thanks, but I am logged in and it still shows that. Clicking log in at the top of the page leads me to the login page, which takes about 10 seconds (while I'm typing) to realise that I'm already logged in and then redirects me to the homepage (Kagi search)
I don't have any site-specific settings and clearly HN works fine (as well as other sites) so it's not that cookies are disabled or such
Edit: come to think of it, I'm surprised that you find translator data to be more sensitive (worth sticking behind a gatekeeper) than user logins. Must have been a lot of work to develop this intellectual property. There is no Cloudflare check on the login page. Not that I'd want to give you ideas, though! :-)
> come to think of it, I'm surprised that you find translator data to be more sensitive (worth sticking behind a gatekeeper) than user logins. Must have been a lot of work to develop this intellectual property. There is no Cloudflare check on the login page.
This is just a simple anti-bot measure so we do not get hammered to death by bots (Kagi does not have an infinite treasure chest). It is not needed for search, because you cannot use search for free anyway.
I see, that makes sense!
Zalgorythmz on crack. Brainz out of whack.
(cf. https://en.wikipedia.org/wiki/Zalgo_text https://en.wikipedia.org/wiki/Zombie https://en.wikipedia.org/wiki/Crack_cocaine )
Interesting. Never had such an issue with Google. How do they do it?
By not having a puzzle-less/unsolvable captcha, or what do you mean?
Yes, exactly: they don't have one. Of course there must be a reason for that. Some other tech, for example?
Disclaimer: I am already a Kagi customer.
At least for Afrikaans I'm not impressed here. There are some inaccuracies, like "varktone" becoming "pork rinds" instead of "pig toes" and also some censorship ("jou ma se poes" does NOT mean "play with my cat"!). Comparing directly against Google Translate, Google nails everything I threw at it.
I didn't see any option to provide feedback, suggested translations, etc, but I'm hopeful that this service improves.
Just tried translating your comment to German. Kagi took a very literal approach, keeping sentence structure and word choice mostly the same. Google Translate and DeepL both went for more idiomatic translations.
However, translating some other comments from this thread, there are cases where Kagi outperforms the others on correctness. For example, one comment below talks about "encountering multiple second page loads". Google Translate misunderstands this as "encountering a second page load multiple times", while DeepL and Kagi both get it right with "encountering page loads of multiple seconds" (with DeepL choosing a slightly more idiomatic wording).
This is the link they gave for feedback: https://kagifeedback.org/d/5305-kagi-translate-feedback/4
I asked some inappropriate things and it was "translated" to "I cannot assist with that request". It definitely needs to be clearer when it's refusing to translate. But then again, I don't even use Kagi.
Maybe they are using the Claude API for the translation; Claude models are really good multilingual models.
EDIT: the "Limitations" section reports the use of LLMs without specifying the models used.
> some censorship
How the hell is everyone okay with it?
Why should I be "forbidden" from understanding a text written in a foreign language if it contains something inappropriate?
I don't know of any translation service that does censorship.
Not even Google does it, and Kagi was supposed to be less user-hostile than Google, not more.
This is kind of natural for a new service; one of the advantages the big players have is a giant test corpus. For less mainstream languages and terms it will be more noticeable.
"The game is my poem" when back-translated from the Turkish translation, "oyun benim şiirimdir". And there's censorship too when doing EN-TR for a few other profanities I tested. When you add another particular word to the sentence, it outputs "play with my cat, dad".
Just some quick usability feedback: as long as DeepL translates asynchronously as I type, while Kagi requires a full form send and page refresh, I am not inclined to switch (translation quality is also already too good for my language pairs to consider switching for minor improvements; the usability/speed is the real feature here).
This is coming from a user with an existing Kagi Ultimate subscription, so I'm generally very open to adopting another tool if it fits my needs.
Slightly offtopic, slightly related, and as already mentioned the last time I saw Kagi hit the HN front page: the best improvement I could envision for Kagi is improved search performance (page speed). I still encounter multiple second page loads far too frequently, which I didn't notice with other search engines.
Interesting. I'm actually annoyed that DeepL sends every keystroke; I'm using who knows how many resources on their end when I'm only interested in the result at the end, and the final version is all I want DeepL to receive.
That it's fast, that you don't have to wait much between finishing typing and the result being ready, is great, and probably better than any form system is likely to be. But if it could be a simple enter press and then async loading of the result, that would sound great to me.
> As long as DeepL translates asynchronously as I type, while Kagi requires a full form send and page refresh,
This leads to increased cost, and we wanted to keep the service free. But yes, we will introduce translate-as-you-type (it will be limited to paid Kagi members).
I uninstalled the DeepL extension because it would load all its assets (fonts etc) into every. single. page. No matter the host.
Unacceptable.
This will be a paid feature apparently: https://kagifeedback.org/d/5305-kagi-translate-feedback/9
Kagi develops lots of features, but they often seem to be quarter-baked.
Maps, for example, is basically unusable and has been for a while (at least in Germany).
Trying to search for an address often leads Kagi Maps to a different, random address.
Still love the search, but I'd love for Kagi to concentrate on one thing at a time.
We are focusing most of our resources on search (which, I hope you can agree, we are doing a pretty good job at). And it turns out search is not enough and you need other things - like maps (or a browser, because some browsers will not let you change the search engine, and then our paid users cannot use the service). Both are also incredibly hard to do right. If it appears quarter-baked (and I am the first to say that we can and definitely will keep improving our products), it is not for lack of trying or ambition but for lack of resources. Kagi is 100% user-funded. So we need users, and we sometimes work on tools that do not bring us money directly but bring us users (like Small Web, Universal Summarizer or Translate). It is all part of the plan. And it is a decade-long plan.
I absolutely did not mean to imply that you did not want to improve the products.
I did assume that you are missing the resources for the many products you develop.
It's just very sad to show/recommend Kagi to people and then have them (or me) run into so many bugs, sometimes product-breaking ones (such as the Maps issue I mentioned; I would love to use Kagi Maps, but it's so broken that I just can't).
Would love to travel 10 years into the future of Kagi's roadmap.
You do not need your own browser. Keep Firefox alive.
Mozilla is digging a grave for Firefox with its own hands; we need another non-Chromium option, and Orion serves that fine. I just wish they'd focus on fewer things, as has already been mentioned, rather than producing many half-baked ones.
I have high hopes for Ladybird being our savior in this regard.
I beg to differ. The web does not need more ad-supported browsers. Orion is built for the users and paid for by the users.
It stands against all that is wrong with the web today - advertisers and third parties tracking at every step - perhaps like Firefox used to do 15-20 years ago.
We need more of Orion today, and more than ever.
Thank you for making Firefox plugins available for iOS.
I pay for search, but not Orion. I assume you see that much of my paid Kagi search traffic comes from Orion, and put money into appropriate buckets.
I think the community needs more browser options, sure, but I don't understand why Kagi needs Orion. It seems like a distraction from your core competencies. I'm curious whether the writing is on the wall for search alone.
Love the plan, but I'd suggest being more up front with users about how "finished" a product is.
With the maps example, you run into problems because of expectations. If you slap a BETA or ALPHA logo on the maps product, expectations will be lower, and people are more forgiving of issues while you continue improving the product. Or if it’s only good in the US (just an example), make it clear somehow when searching for addresses outside the US.
Just my 2 cents as a paying Kagi customer.
Interestingly, we do not get a lot of bug reports for Maps on our feedback forum, and that is where we tend to go to look for problems to fix.
As a paid Kagi user, that might be because I tried Kagi Maps once, went "yep, that's crap", and from then on went to Google for any maps-related search.
However, I neither expect nor need Kagi to have a perfect replacement for every single Google product. I'd rather it focus on creating better versions of the things that Google is bad at (especially basic search) rather than providing bad versions of the things Google is good at (maps, translate).
As a Kagi early adopter… why would I file bug reports for a feature I actively avoid using?
I can totally recommend search to anyone, but I agree with others in this thread that most of the toys feel beta. I'm glad to have them but can't recommend them.
For maps, your goal of being ad-free goes against what I need from a maps search. 90% of the time I search for restaurants, museums, businesses, opening hours, and phone numbers of various local shops. People add that data to Google and not to many other maps services :(. That is where they advertise how to be contacted. Addresses and directions are really secondary in a maps search.
Like others here are saying, Kagi Maps is so far behind that I wouldn't bother with any bug reports or feedback. I tried it just now, was panning around a region in Europe, clicked the "Hotels" button to see what it would present, and got sent to a town called Hotels in Palestina with a Wikipedia description of what a hotel is...
So the suggestion to slap a beta sticker on Maps is a good one. Nokia, Microsoft and Apple have all tried to compete with Google Maps without succeeding. Do yourself a favour and start using the Google Maps API for Kagi Maps; that's probably the only way you can get all the important data. If the API is expensive, then charge more for maps. Kagi customers want the best product and are willing to pay for it.
As an anecdatapoint, I have replaced the button with a redirect to Google Maps. It's not worth trying to extract value from the Kagi one; I probably gave it a chance ~20 times and I don't think it did what I needed a single time. (In Scandinavia)
As a paying Kagi user, in my case, that's because I !g any query that I expect to give local results for, and I often go directly to maps.google.com for map results. The general search results are awesome, particularly in my tech bubble, since I don't have to see w3schools garbage and the like. Localized stuff, not so great, and maps I prefer to avoid.
(Edited to add: Though perhaps I should give maps a try again. They seem to have gotten better since I formed my muscle memory.)
Where do I find the map feature?
I'm curious to see if I can identify what data source and search software it is based on, since I've heard similar complaints about Nominatim, and it is indeed finicky if you make a typo or don't know the exact address; it does no context search based on the current view, AFAIK. Google really does do search well compared to the open-source software I'm partial to, I gotta give them that.
Edit: ah, if you horizontally scroll on the homepage there's a "search maps" thing. Putting in a street name near me that's unique in the world, it comes up with a lookalike name in another country. Definitely not any OpenStreetMap-based product I know of, then; they usually aren't that loose with matches. Since the background map is Apple by default, I guess that's what the search is as well.
It’s in Search. It’s one of the types of search you can perform. Below the search input is a bar with “Images”, “Videos”, “News”, and “Maps”.
Can also be found here:
https://kagi.com/maps
> I'm curious to see if I can identify what data source and search software it is based on
Kagi uses Apple Maps
> Quality ratings based on internal testing and user feedback
I'd be interested in knowing more about the methodology here. People who use Kagi tend to love Kagi, so bias would certainly get in the way if not controlled for. How rigorous was the quality-rating process? How big of a difference is there between "Average", "High" and "Very High"?
I'm also curious about the 1 additional language that Kagi supports (Google is listed at 243, Kagi at 244).
> Kagi Translate is free for everyone.
That's nice!
A quick scrape of the two sites gives (literally a diff of sets of the strings used in language selection):
In Kagi, not Google:
In Google, not Kagi:
They really must have copied Google, because like I said this was diffing exact strings, meaning that slight variations of how the languages are presented don't exist.
> I'm also curious about the 1 additional language that Kagi supports (Google is listed at 243, Kagi at 244).
I just copied all of the values from the select element on the page (https://translate.kagi.com/) and there's only 243. Now I genuinely wonder if it's Pig Latin. https://news.ycombinator.com/item?id=42080562
Also notable: Google claims to support Inuktut and Tshiluba, and I don't see those two in Kagi.
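For the curious, the diff itself is only a few lines of Python. A hypothetical sketch, assuming Kagi's language select element is present in the server-rendered HTML (Google builds its picker client-side, so that list would have to be pasted in by hand):

    # Hypothetical sketch: diff the language names offered by two translators.
    import requests
    from bs4 import BeautifulSoup

    def select_options(url: str) -> set[str]:
        html = requests.get(url, timeout=30).text
        soup = BeautifulSoup(html, "html.parser")
        return {o.get_text(strip=True) for o in soup.select("select option")}

    kagi = select_options("https://translate.kagi.com/")
    google = {"Afrikaans", "Albanian"}  # paste Google's full list here

    print("In Kagi, not Google:", sorted(kagi - google))
    print("In Google, not Kagi:", sorted(google - kagi))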
I am very suspicious of the results. A few months ago they published an LLM benchmark, calling it "perfect", while it actually contained only around 50 inputs (academic benchmark datasets usually contain tens of thousands of inputs).
I recently noticed that Google Translate and Bing have trouble translating the German word "Orgel" ("organ", as in "church organ", not as in "internal organs") to various languages such as Vietnamese or Hebrew. In several attempts, they would translate the word to an equivalent of "internal organs" even though the German word is, unlike the English "organ", unambiguous.
Kagi Translate seems to do a better job here. It correctly translates "Orgel" to "đàn organ" (Vietnamese) and "עוגב" (Hebrew).
Google Translate often translates words through English.
DeepL also, for the record (since it's being compared in the submission)
It's pretty clear if you use words out of context that are true friends: you get the German translation of the English translation of whatever Dutch thing you put in. I also heard somewhere, perhaps when interviewing with DeepL, that they were working towards / close to not needing to do that anymore, but so far no dice that I've noticed, and it has been a few years.
The dearth of Lojban and Ithkuil texts holds back machine translation, for they would be perfect intermediate languages.
</ha-ha-only-serious>
If you write the input in Pig Latin, Kagi detects it as English but translates it correctly.
Bing detects it as English but leaves it unchanged.
Google detects it as Telugu and gives a garbage translation.
ChatGPT detects it as Pig Latin and translates it correctly.
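For anyone who wants to reproduce that test, a crude Pig Latin encoder is enough (my own sketch; it ignores punctuation and treats "qu" as two separate letters):

    def pig_latin(word: str) -> str:
        vowels = set("aeiouAEIOU")
        if word[0] in vowels:
            return word + "way"
        for i, ch in enumerate(word):
            if ch in vowels:
                # Move the leading consonant cluster to the end, add "ay".
                return word[i:] + word[:i] + "ay"
        return word + "ay"  # no vowels at all

    print(" ".join(pig_latin(w) for w in "translate this sentence please".split()))
    # anslatetray isthay entencesay easeplay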
Looks like the page translator wants to use an iframe, so of course the x-frame-options header of that page will be the limiting factor.
> To protect your security, note.com will not allow Firefox to display the page if another site has embedded it. To see this page, you need to open it in a new window.
This is a super common setting and it's why I use a browser extension instead.
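A quick way to check whether a given page will hit that limitation (a sketch; the URL is a placeholder, not a specific article):

    # Inspect the anti-framing headers that block iframe-based translators.
    import requests

    resp = requests.head("https://note.com/some-article", allow_redirects=True)
    print(resp.headers.get("X-Frame-Options"))          # e.g. DENY or SAMEORIGIN
    print(resp.headers.get("Content-Security-Policy"))  # look for frame-ancestors
    # Either header can forbid embedding; the browser enforces it, so an
    # iframe-based page translator has no way around it.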
I could be missing something, but is there some sort of metric for these comparisons to other software? Something like the BLEU score, which I've seen in studies comparing LLMs to Google Translate. I find it difficult to believe it is better than DeepL in a vacuum.
+1
I'm also interested in the benchmarks they've used, if any.
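For reference, BLEU-style scoring is only a few lines with the sacrebleu package. A minimal sketch with made-up sentences (real evaluations use standard test sets with thousands of segments):

    import sacrebleu  # pip install sacrebleu

    # System outputs and one reference stream, made up for illustration.
    hypotheses = ["The organ plays in the church.", "The cat purrs."]
    references = [["The organ is playing in the church.", "The cat purrs."]]

    bleu = sacrebleu.corpus_bleu(hypotheses, references)
    print(f"BLEU: {bleu.score:.1f}")  # higher means closer to the references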
Has anyone seen info on how this works? "It's not revolutionary" seems like an understatement when you can do better than DeepL and support more languages than Google.
It just uses LLMs; I've had it output a refusal in the target language by entering stuff about nukes in the input.
I'm pretty sure it's just a finetuned LLM.
I have some experience experimenting in this space; it's not actually that hard to build a model which surpasses DeepL, and the wide language support is just a consequence of using an LLM trained on the whole Internet, so the model picks up the ability to use a bunch of languages.
I'm almost sure they did not fine-tune an LLM. They are using existing LLMs, because fine-tuning to best the SOTA models at translation is impractical unless you target very niche languages, and even then it would be very hard to get a better dataset than what is already used for those models.
Probably all they are doing is switching between some Qwen model (for Chinese) and a large Llama, or maybe OpenAI or Gemini.
So they just have a step (maybe also an LLM) to guess which model is best or needed for the input. Maybe something really short and simple just goes to a smaller, less expensive model.
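Something like this, if that guess is right. A purely speculative sketch; the model names and thresholds are placeholders, not anything Kagi has confirmed:

    def call_llm(model: str, prompt: str) -> str:
        # Stand-in for an actual API call.
        return f"[{model}] {prompt!r}"

    def pick_model(text: str, source: str, target: str) -> str:
        if len(text) < 50:                # short, simple input: cheap model
            return "small-cheap-model"
        if "zh" in (source, target):      # Chinese pairs: Qwen-style model
            return "qwen-style-model"
        return "large-general-model"

    def translate(text: str, source: str, target: str) -> str:
        model = pick_model(text, source, target)
        return call_llm(model, f"Translate from {source} to {target}: {text}")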
It uses a combination of LLMs, selecting the best output. (from the blog post)
Ah, I missed that. Thank you!
Kudos on the launch! Looking good!
One benefit of Google Translate is with languages like Hebrew and Arabic, you can enter in those languages phonetically or with on-screen keyboards.
It would be very handy if you allowed putting text in the query string for translate.
Example for Google Translate: https://translate.google.com/?text=%s
EDIT: You already support it! Nice, I didn't see it mentioned anywhere. This works: https://translate.kagi.com/?text=%s
I wonder how we set the language in the query string? And can we execute the translation immediately on visit instead of needing to hit the Translate button?
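Building such a link programmatically, using only the text parameter confirmed above (whether a language parameter exists is undocumented here):

    from urllib.parse import urlencode

    url = "https://translate.kagi.com/?" + urlencode({"text": "die Katze schnurrt"})
    print(url)  # https://translate.kagi.com/?text=die+Katze+schnurrt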
https://kagifeedback.org/d/5305-kagi-translate-feedback/22
I would love to see an API to compete with DeepL.
Very impressed with how well it handles Romanized Arabic.
https://en.wikipedia.org/wiki/Romanization_of_Arabic
This is good. I wish it handled you-singular vs. you-polite-plural though.
It would be nice to say "use a casual tone". Or "the speaker is a woman and the recipient is a man".
How does this compare to using GPT-4 directly?
It varies depending on the language, but I find GPT-4o to be good at knowing the context and sometimes going with the intent, not just the grammar and rules of the language. But for most cases it is overkill, and you still have the chance of hallucination (although it occurs less often in these use cases).
This is of course based on my experience using it between Arabic, English and French, which are among the 5 most popular languages. Things might be dramatically different with other languages.
Have you compared GPT-4o to Kagi?
They might actually be the same thing in some cases.
Not yet, I only just learned about Kagi Translate.
I don't know how the translation quality compares, but the advantages to this would be that it's free and it can translate web pages in-place.
And presumably the energy efficiency of a dedicated translator compared to a generic language system, assuming they didn't build this on top of a GPT. The blog post doesn't say, but I'm assuming (perhaps that's no longer accurate) that it's prohibitively expensive for a small team without huge funding to build such a model as a side project.
ChatGPT does better -- it picks up context and produces more idiomatic output.
Kagi translate does pick up context (for instance, "Anne is older than her sister Carmen" is a good test for languages that have different words for older and younger sister -- Google Translate gets this wrong all the time).
But the Kagi output is stilted and grammatically incorrect for, say, Cantonese.
On a side note, does anyone know what methods Yandex uses? When I try to translate Kyrgyz, Google Translate shows utter trash that makes less sense than if they chose words at random. Yandex in comparison is very impressive. What's the secret ingredient?
There is no secret ingredient; Yandex just put more effort into supporting languages that are spoken in ex-USSR countries, because that's the most important market for them.
Other translation tools do not consider, e.g., Kyrgyzstan an important market and therefore do not put much effort into supporting Kyrgyz.
I wish Kagi would focus on search rather than all these side projects trying to be Google.
What do you miss in Kagi Search?
Yes! This is what I found missing with Kagi. Great! I will test this out soon.
Added to my list, very nice.
One thing I like about Google Translate that neither DeepL nor this does is tell me how to say the word. I mainly use it to add a reading hint to an otherwise opaque Japanese title in a database.
It is silly, and very tedious, to have to "Verify you are human" after each translation. How likely is it that after a few translations I might become a robot?
Any plans to include this directly in the Android app? It feels fairly obvious to me as a user - I went there automatically after the news dropped, but... not there yet ;)
How would you want to see that integrated?
Nothing fancy.
Two text fields with a send button. Accessible from the hamburger menu, or as another button below the search entry box; the search box transforms into the first language's text field.
Are there any small LLMs intended for translation between two languages? They would be much more convenient to use offline, albeit slower.
> Limitations
> We do not translate dynamically created content ...
What does that mean?
I assume it means they only translate what's in the HTML, not anything that's added via Javascript later.
Is that a relevant username, or is J your initial? I can't quite place what "JavaScript heard" would mean. I've wondered before but there's no contact in your profile and now it felt at least somewhat related to the comment itself, sorry for being mostly off-topic
It's an initial :p
Mystery solved! Thanks for obliging my curiosity :)
Indeed, that's what would make most sense to me.
I also strongly suspect the way they're able to make it free is by caching the results, so each translation only happens one time regardless of how many requests for the page happen. If they translated dynamic content, they couldn't (safely) cache the results.
I don't think JS vs HTML would make any difference to caching.
If they are caching by URL, you can have dynamic HTML generation or a JS-generated page that is the same on every load.
If you are caching by the text, then you can do the same for HTML or JS-generated pages (you just read the text out of the DOM once the JS seems done).
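The two keying strategies being contrasted, as a quick illustrative sketch:

    import hashlib

    def key_by_url(url: str) -> str:
        # Safe only if the same URL always yields the same content.
        return hashlib.sha256(url.encode()).hexdigest()

    def key_by_text(dom_text: str) -> str:
        # Works for server-rendered and JS-built pages alike, provided the
        # DOM text is snapshotted after scripts have settled.
        return hashlib.sha256(dom_text.strip().encode()).hexdigest()

    print(key_by_url("https://example.com/article")[:12])
    print(key_by_text("Hello world")[:12])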
Yeah, JS can be static or dynamic, so it's not just whether it's JS that matters. It's whether the content is added or modified after initial rendering that makes it dynamic.
Most JS-heavy pages retrieve data from APIs, and the static part of the code is just layout and menus, which isn't the part people care most about translating. That's why GP said "added via Javascript later." The important part of that isn't the "Javascript", it's the "later."
Ah, that makes sense. In my head it sounded like server-side dynamic content OR not wanting to translate LLM outputs, neither of which makes sense or is possible.
That's what I think too, which kinda makes sense since it's a page and not a browser plugin. If they implemented a browser plugin that did what Google recently removed from their plugin, that would be a killer feature (assuming they can then translate all HTML as it comes in).
Brave browser already does it, though sometimes it's unusably slow.
I would guess it's only able to translate the HTML content sent on page load - so static webpages, but not SPAs etc.
I doubt it is better than DeepL or Google. In some tests it couldn't recognize the correct language.
That's odd. Clicking the switch languages icon swaps the languages but not the texts.
I find it useless without an option to add context to the text I want to translate.
What do you mean? Does any other translator have such a separate field that you could point to, or could you explain what you're missing?
When I want to give DeepL context, I just write it into the translation field (also, because it's exceptionally bad at single-word translations, I do it even if the word should be unambiguous), so I don't type in "Katze" but "die Katze schnurrt" (the cat purrs). Is that the kind of thing you mean?
Of pre-LLM major online translators, Bing is the only one I've noticed that has a dropdown that offers standard/casual/formal, and produces multiple outputs if it detects masculine/feminine differences.
For LLM-based translators, it usually works if you add relevant details in parentheses.
Ooh, that sort! DeepL also has a formality setting, but outputting multiple options sounds great and isn't something I've seen (I've used Bing Translator before; maybe I forgot or maybe this language didn't support it).
Has anyone else noticed that Google Translate trips up a lot on GDPR cookie consent dialogs in Europe? I've often had to copy/paste the content of a web page because Google, when given the URL, couldn't navigate past the dialog to get to the page content (or wouldn't let me dismiss it). Not sure if Kagi has solved this.
Some bugs to iron out
"Document Too Long Document is too long to process. It contains 158 chunks, but the maximum is 256. Please try again later or contact support if the problem persists."
Fixed, thanks for reporting.