For anyone who didn't click through to the WebKit bug report the author submitted, a WebKit dev asked him to clarify why the BBC finds it beneficial to be able to detect that the event was sent from a keyboard. This is the author's response:
> Ironically, I want interoperability on this to help with use cases relating to accessibility.
> I work at the BBC and, on our UK website, our navigation bar menu button behaves slightly differently depending on if it is opened with a pointer or keyboard. The click event will always open the menu, but:
> - when opening with a pointer, the focus moves to the menu container.
> - when opening with a keyboard, there is no animation to open the menu and the focus moves to the first link in the menu.
> Often when opening a menu, we don't want a slightly different behaviour around focus and animations depending on if the user 'clicks' with a pointer or keyboard.
> The 'click' event is great when creating user experiences for keyboard users because it is device independent. On keyboards, it is only invoked by Space or Enter key presses. If we were to use the keydown event, we would have to check whether only the Space or Enter keys were pressed.
This is fascinating, because the naive English interpretation of the code and the comment on that WebKit bug don't match the actual structure of the code. Here's the relevant code:
Ignoring the actual conditions entirely, this code seems to be trying to categorize the event into one of two categories: mouse or keyboard. But what it actually does is to categorize into one of four categories: (mouse and not keyboard), (keyboard and not mouse), (keyboard and mouse), and (neither keyboard nor mouse). And, as the original bug shows, (neither keyboard nor mouse) is handled inappropriately. One might wonder whether (keyboard and mouse) works well.
Either the code should be deliberate about the fact that (is it a keyboard) and (is it a mouse) are separate booleans, or the code should be structured so that the actual categories are mutually exclusive. For example:
const isInvokedByMouse = ...
and use !isInvokedByMouse to check for keyboardiness, or:
const eventSource = ... (returns "keyboard" or "mouse")
or, perhaps even better:
const eventSource = ... (returns "keyboard", "mouse", or "not sure")
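A minimal sketch of the last option, so there is one answer per event instead of two independent booleans. The function name `getEventSource` is hypothetical, and the 0,0 screen-coordinate test is only the heuristic discussed in the article, assumed here for illustration; it is not a reliable signal.

```javascript
// Hypothetical helper: returns exactly one of "keyboard", "mouse", or "not sure",
// so "both" and "neither" cannot be represented by accident.
function getEventSource(event) {
  // The handler in the article also receives keydown events directly.
  if (event.type === "keydown") {
    return "keyboard";
  }
  if (event.type !== "click") {
    return "not sure";
  }
  const { screenX, screenY } = event;
  if (typeof screenX !== "number" || typeof screenY !== "number") {
    return "not sure";
  }
  // Assumed heuristic: a click reported at screen position 0,0 is treated as
  // keyboard-initiated. A real pointer click at 0,0 would be misclassified.
  if (screenX === 0 && screenY === 0) {
    return "keyboard";
  }
  // Any other coordinate, including negative ones on multi-monitor setups.
  return "mouse";
}
```

Callers then switch on a single value, and the "not sure" branch forces an explicit decision about the fallback behaviour.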
This is a great comment ^ whenever you use two booleans like this, you're opening yourself up to "unrepresentable state" logic errors. Finding and noticing this in code can be tricky, but it's a great example of something that should be caught during code review.
The first mistake the developer made, was that he wanted to create a different user experience between keyboard and mouse. Stick to what you get by default and design your components so they work for both usecases. Don't try to be smart when it comes to accessibility.
What he ended up doing is what I would have considered a hack. A solution that inevitably breaks or has side effects.
The reason there rarely are good handles to do things differently in accessibility context, is because it's not something that's meant to be handled differently.
See I work in accessibility. Like I provide and create solutions direct to end users with complex needs. Not regular web accessibility. I get the view of this. It’s the same idea of universal access. But actually I don’t fully agree. Yes. If you can stick to this principle - and do try / but I promise you edge cases - which in itself is what accessibility users are all about - cause headaches. At some level you have to do custom stuff. It’s the best way. Take for example switch users. Yes. If your ui is tab able - great. But what if you need your items scannable in frequency order. Your tab index needs to change to meet the end users needs. Or eye gaze users. The accuracy level changes. Add in cognitive issues. You can’t just make a one size fits all interface. At some stage you need to significantly customize it. You can’t rely on a user just learning a complex system level interaction technique- if they can’t do that you have to customise on an individual level.
Of course there are edge cases, I work with accessibility too, for an app in the public sector where WCAG rules are no joke, so I know this as well but even so, we don't build custom accessibility UI for our users. We (try to) build the UI with accessibility in mind so it's scalable, can be used and navigated properly by voice over and keyboard.
On mobile it's not perfect either but in general you do have features to change stuff like focus, grouping of elements, how the keyboard navigates the view stack, how to access a button through custom actions and, like you mention, change the tab index programmatically.
Even so, not everything can be fixed or handled through standard accessibility means and as such hacks will inevitably make it into the products.
I get what you're saying but I still think that making things accessible and designing with common accessibility in mind should be default and as such it has to be thought about when designing and developing from the get go. Having to create custom interfaces to fulfill a specific need might be a good fit for some things but not when developing apps and websites unless you're targeting that user-group specifically.
Well said! It certainly applies to web development as well. Sadly, sometimes more complex solutions are needed - especially when based on user research.
> The first mistake the developer made, was that he wanted to create a different user experience between keyboard and mouse. Stick to what you get by default and design your components so they work for both usecases.
We have. The behaviour is mostly the same whether you're using the keyboard or a pointer (mouse/touch/pen). The only difference is that, for keyboard users, we want to turn off the animation and move the focus to the first link in the menu instead of focussing on the menu's parent <ul>.
The problem was that, as various devs have iterated on the menu over the years, it's broken the fallback behaviour. For my colleague on the funny multi-monitor set up, it should have fallen back to the keyboard no-animation behaviour with no real major difference to the UX, but instead it fell back to the no-JS experience.
So yes, generally don't try to be smart with accessibility, avoid ARIA attributes except where necessary, etc, but click events are the universal input event and work on any kind of input device and have perfect browser support. It's far better for accessibility using them instead of a mix of keydown and mousedown or pointerdown, and potentially missing other kinds of input events.
As I stated in another comment, if it was a scenario where there needs to be a major difference in behaviour between keyboard and pointers, then I would rather use separate keydown and pointerdown events.
The _mostly_ same behavior is what caused the problem though :P I'm curious, did these solutions come to pass because you had to make adjustments based on actual user feedback or was it just a developer trying to think ahead? I'm questioning whether forcing the user to tab to get to the menu item is a hindrance at all or whether the animation was a problem.
Maybe the former could have been solved using ARIA tags or maybe it would require bigger changes to the component itself. Accessibility is a roller-coaster for all these reasons alone.
The BBC site has a "search box" that's actually a button that brings up the real search box. Always feels confusing. At least it's consistent across News / Sounds / iPlayer.
> The first mistake the developer made, was that he wanted to create a different user experience between keyboard and mouse.
No, they wanted to make them the same. It's just that giving a blind person the same experience as a seeing person requires different things because they operate differently for obvious reasons. For example, a blind person can't see when an animation has finished. They expect that menu to be available once they've triggered it. However, seeing people see the dropdown appearing and then go to use it once it's ready.
> Don't try to be smart when it comes to accessibility.
In all seriousness, considering the state of accessibility as is, I think going outside the box isn't trying to be smart. It's actually being smart. The BBC frontend team is probably at the forefront of making high-traffic websites extremely usable.
> For example, a blind person can't see when an animation has finished. They expect that menu to be available once they've triggered it. However, seeing people see the dropdown appearing and then go to use it once it's ready.
For my two-cents, the BBC was simply trying too much to be "cutesy". Don't animate anything, because the silly animation on mouse click simply makes the website feel slower overall. Just open the menu as fast as the user's browser will open it.
While I can understand the author's need for screenX and screenY, the question remains. Why would screenX return the real screenX position instead of the position within the renderer (I don't think that exists?) or the rendered page (layerX and layerY)? The author's need would be met the same with the renderer position and window positions wouldn't be leaked to all visited websites.
My understanding from this is that BBC want slightly different behaviour depending on whether it's a mouse or keyboard "click" (keyboard shouldn't show the animation and should focus the first link in the menu).
However, they also want the ease of binding to a single event and while binding to "click" can do this, they have no way to tell whether it was a mouse click or keyboard press which triggered the event.
To solve this they're using an unreliable heuristic after realising in Chrome if the mouse position is screenX=0, screenY=0 it means the event was either triggered by a mouse click at screenX=0, screenY=0 or a keyboard.
As someone who's worked on accessibility projects in the past, this is a really stupid idea imo, and had I reviewed a PR with something like this I would have asked for it to be reworked. While I agree browsers should ideally do the same thing, the real issue here seems to me that screenX and screenY make little sense on a "click" triggered by a keyboard.
The solution ideally would be a new event ("trigger" or something) which doesn't emit a "MouseEvent", but something more generic which could apply to both a keyboard and mouse event and provide information about the "trigger" source. Imo keyboard "clicks" are weird to begin with and would ideally be fixed with a more appropriate event.
That said, I understand this doesn't currently exist in the spec and a solution is needed now. Therefore I don't see why they couldn't also bind to a "keydown" event then if the click is triggered alongside the "keydown" on the same element, assume it was a keyboard press. That would be far more reliable and far less hacky than what they're doing, and would allow them to trigger from the single event with a bit of extra code to detect if it was a keyboard or mouse.
Hello I am the author and, yes, I totally agree some generic 'trigger' event would be far better.
To use the keydown event means we have to assume that the 'Enter' and 'Space' are the only keys we need to check for. Using 'click' is far safer from an accessibility point of view because it will always respect what your device considers to be some kind of input trigger.
As stated in the UI Events spec:
> For maximum accessibility, content authors are encouraged to use the click event type when defining activation behavior for custom controls, rather than other pointing-device event types such as mousedown or mouseup, which are more device-specific. Though the click event type has its origins in pointer devices (e.g., a mouse), subsequent implementation enhancements have extended it beyond that association, and it can be considered a device-independent event type for element activation.
And to be clear, I would not want to do it this way if it was for some really critical difference in behaviour between pointer or keyboard interactions. I'm OK with this strange mechanism here because the fallback behaviour is not that different. If you're on Safari, for example, which can't check for `screenX === 0`, then all that happens is that there will be an animation when you open the menu.
However, sadly, because of the ways various developers have added to this code over the years, it's broken that fallback behaviour and stopped it working entirely. So I've just finished a refactor to sort that out and it will hopefully be going live soon.
> Often when opening a menu, we don't want a slightly different behaviour around focus and animations depending on if the user 'clicks' with a pointer or keyboard.
Is the word “don’t” a mistake which gives the sentence the opposite of the intended meaning?
> All we had to do was change the isInvokedByMouse to check that screenX and screenY don't equal 0, rather than checking if they are greater than 0.
It's obviously extremely unlikely but what if the mouse is actually at 0,0 when the user clicks? I'm not very familiar with JS, is checking for != 0 really the best/only way to do this?
EDIT: actually upon going back, I realized I didn't fully process this sentence originally but it seems to address this:
> We should probably do further refactoring of the event handler function, since it's complicated by the fact that it also handles keydown events. For now, though, this fix will do just fine.
It seems that querying the screen position is just a heuristic they came up with to determine the nature of the event. Instinctively I would use instanceof MouseEvent for this, but even this feels risky/hacky to me.
My question is why they're relying on those heuristics. My guess is that toggleMenu is being used by multiple event handlers. Or maybe there's something else going on that is specific to their codebase.
It's hard to judge without knowing the full picture.
Maybe browsers will report `click` events that aren't actually created by a pointer device (maybe a screen reader or something?). But that still raises the question of why you would care. It seems to me like if the platform wants to report it as a `click`, your app should treat it as one and not try to get "clever" about it.
For compatibility with the Web content, the `click` event has become a device-independent activation event. Sites can't be expected to listen for events from every kind of device, so instead all devices send `click`s.
They care, because focus for keyboard-controlled screen readers sending "click" should behave differently: an element inside the menu should receive focus, even though it's not the element that has been clicked. Otherwise, if focus stayed on the top-level menu bar, screen reader users would be lost and would have to navigate to the menu's content themselves.
No, it's not. You can use a media query on whether the primary input device is a pointer device (and, further, whether it has high accuracy) and then filter on that.
I used it to select which layout to show in the past.
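For reference, a rough sketch of that kind of check from JavaScript, using the CSS `pointer` media feature (values `fine`, `coarse`, `none`). The helper name `primaryPointerKind` is made up for illustration, and `matchMedia` is passed in as a parameter only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper. By default it uses the browser's window.matchMedia;
// a stub can be passed in for testing.
function primaryPointerKind(matchMedia = (query) => window.matchMedia(query)) {
  if (matchMedia("(pointer: fine)").matches) {
    return "fine"; // e.g. a mouse or trackpad is the primary input
  }
  if (matchMedia("(pointer: coarse)").matches) {
    return "coarse"; // e.g. a touchscreen is the primary input
  }
  return "none"; // no pointing device reported as primary
}
```

Note that this describes the device's primary input mechanism, not what triggered a particular click, so it suits layout decisions better than per-event behaviour.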
If you want to listen to input on touch only then you can do that and call preventDefault on the event so that the browser does not then cause a click event. Or you can just save yourself the trouble and write a click handler.
Kudos to BBC for investing in accessibility, and unfortunately discovering a nasty bug.
As an industry, why haven't we figured out how to make drop downs that consistently open for all users? Is accessibility just that hard? Are there web frameworks/web components BBC should be using that already handle this?
I've been wary (as a backend-focused full-stack developer) about tweaking the browser's components. There's so much nuance to how they work and the implementations are battle tested. The idea of creating a custom text box (for example) without doing extensive research of text box behavior across platforms seems ripe for failure. I notice broken copy/paste and dropped characters often enough (on major corporate sites too). Why are text boxes broken in 2024? React feels arrogant to me now.
Personally, I've tried to handle this with server-side templates, CSS frameworks like Bulma, minimal JS. It's not viable for sites demanding slick custom branding (vanity?) but my text boxes work and my site doesn't cost a fortune to develop. Is it accessible to BBC standards? I'm not sure.
Agreed. But ultimately many issues arise when user agents customize these elements in very dubious ways. It's ok for the most part, but there's a reason behind reset.css files, and I wager a more nuclear approach was used here to circumvent these issues completely.
I've searched for reasons and couldn't find much. The fact that a website can know where a browser window is located (window.screenX/window.screenY) and that clicks position can be reported in that coordinate system sounds insane for a desktop. TOR Browser seems to spoof screenX and screenY to avoid fingerprinting.
Has anyone seen good use-cases for that feature? I'm thinking about dual window applications that interact with each other (I think I saw a demo of something like this a while ago on HN but I wasn't able to find it again), or sites where behavior depends on their location on the virtual screen.
Back in the HTML 4 days we did these shenanigans all the time. I worked on very over the top sites that played with multiple windows talking to each other and moving in synchrony. I’ve tried looking for examples on archive.org (e.g. I know we did this a ton on Flash-heavy sites like the Design Museum in London) but alas the ones I was looking for are broken in that archive.
If you're reacting to click events, you might want to know the coordinates of where you're clicking. I mostly use this for click and drag stuff since you can get the delta between events and update position of the thing being dragged.
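A small sketch of that delta approach, assuming event-like objects with `screenX`/`screenY`; the name `createDragTracker` is hypothetical. Only differences between consecutive events are used, so the absolute origin of the coordinate system cancels out.

```javascript
// Hypothetical drag helper: remembers the previous position and reports
// how far the pointer moved since the last event.
function createDragTracker() {
  let last = null;
  return {
    start(event) {
      last = { x: event.screenX, y: event.screenY };
    },
    move(event) {
      if (last === null) {
        return { dx: 0, dy: 0 }; // not dragging
      }
      const delta = { dx: event.screenX - last.x, dy: event.screenY - last.y };
      last = { x: event.screenX, y: event.screenY };
      return delta;
    },
    end() {
      last = null;
    },
  };
}
```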
Why they're checking for coordinates instead of checking event.type is beyond me. Still, I appreciate the write-up; it is a good puzzle, and it's relatable to come across code you didn't write and ask: why is it important that the click coordinate is nonzero? Why can't we just check that event.target is the button we want to activate? Why are we using JavaScript at all when a details/summary tag would do the same job?
Relative coordinates sure, but why would you need the absolute position?
I'm with you on the second point - as unlikely as it is for the click to occur at the origin, it's still a legitimate value being abused as an indicator of something that might not actually be true - quite frankly the code was bad to begin with, and it was still bad after the fix.
Uh. So they can keep track of what the user is doing?
Why would you just send a document when you can generate a heat map of where the user is on your website. And then complain about the performance and wonder why it costs so much to run a modern website.
The issue isn't so much the coordinate of the mouse within a page, but that the coordinates are relative to the virtual screen layout. It describes where your window is located on the screen(s) and the click is expressed in screen coordinates. Mapping those coordinates to your website's renderer coordinates requires additional calculation.
layerX[1] while non-standard is supported and returns a position relative to the top of the page or the top of the parent element. This makes coordinates positive only and 50,50 is the same for all users. For screenX, 3000,1567 is the same coordinate as 15,37 depending on where the window is located.
Why are you filtering for screen coordinates in the first place? What if the user is using e.g. an alternative input device that doesn't have a screen? The `click` event should be enough indication that the user has tried to activate the menu. Why reinvent the wheel?
> Why are you filtering for screen coordinates in the first place?
FTA:
> The isInvokedByMouse was checking whether the click event was invoked by a mouse or touch pointer – rather than a keyboard – by checking if the screenX or screenY coordinates were a positive number.
They were trying to detect whether it was keyboard or mouse activation, and whoever wrote it assumed that screen coordinates of mouse events would always be positive.
Right, but the article doesn't explain why they cared whether it was keyboard or mouse activation. The linked WebKit bug goes into more detail, but it's still lacking an explanation of why alternative, more common and widely deployed strategies (like having a capturing keyup event that triggers earlier in the render loop) wouldn't be a better idea instead.
Also, if you really want to determine whether a MouseEvent is "real" or "synthetic", and you don't want to worry about when mouse events are triggered relative to keyboard events in the event loop (although it doesn't seem very hard to keep track of), it seems like you can use the current click count (i.e., event.detail). This works on both Chrome and Safari—it's 1 for mouse clicks, and 0 for keyboard "clicks", but the spec text is also a little contradictory and under-specified: the "click" event handler says that "the attribute value MUST be 1 when the user begins this action and increments by 1 for each click" (https://w3c.github.io/uievents/#event-type-click) but it also says "This MUST be a non-negative integer indicating the number of consecutive clicks of a pointing device button within a specific time" (https://w3c.github.io/uievents/#current-click-count), and the definition of "pointing device button" seems to exclude synthetic keyboard events (since those are handled separately)
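As a sketch of the click-count idea, assuming the behaviour described above (0 for keyboard "clicks", 1 or more for pointer clicks) and keeping in mind the caveat that the spec text is under-specified. The function name `clickSourceFromDetail` is made up for illustration.

```javascript
// Hypothetical helper based on the click count in event.detail.
// Assumption from the comment above: keyboard-initiated clicks report 0.
function clickSourceFromDetail(event) {
  if (event.type !== "click" || typeof event.detail !== "number") {
    return "not sure";
  }
  return event.detail === 0 ? "keyboard" : "pointer";
}
```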
> They were trying to detect whether it was keyboard or mouse activation
But the code shown doesn't do different stuff for Keyboard vs Mouse, it just checks if it is either one of them. Why would you do that? Which other click event types are there that you want to filter?
I omitted that code to keep the article simpler. We want to turn off the animation on keyboard, and move the focus to different things depending on if it's a keyboard or pointer user.
What is the correct way to check if you have a mouse click rather than a keyboard click? I would be tempted to set a module level flag based on what most recently happened - if there was a "mousedown" event more recently we'll set isKeyboard to false and isMouse to true and vice-versa if "keydown" is pressed.
Then we wouldn't need the isInvokedByMouse and isInvokedByKeyboard functions.
Is there a better way? Relying on screen coordinates for this is highly dubious and I would argue a hack.
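The flag idea described above could be sketched roughly like this. The names are hypothetical, and a single value replaces the separate isKeyboard/isMouse flags so they cannot disagree. Whether this is more robust than other approaches in the thread is not something this sketch settles.

```javascript
// Hypothetical tracker: remembers which kind of input happened most recently.
function createInputTracker() {
  let last = "unknown";
  return {
    // Feed it keydown / mousedown / pointerdown events; others are ignored.
    record(event) {
      if (event.type === "keydown") {
        last = "keyboard";
      } else if (event.type === "mousedown" || event.type === "pointerdown") {
        last = "mouse";
      }
    },
    current() {
      return last;
    },
  };
}

// In a browser this might be wired up along these lines (not run here):
//   const tracker = createInputTracker();
//   document.addEventListener("keydown", tracker.record, true);
//   document.addEventListener("pointerdown", tracker.record, true);
//   button.addEventListener("click", () => console.log(tracker.current()));
```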
This seems like a self-inflicted bug resulting from incorrect heuristics (the assumption that positive screenX/Y values represent a mouse event), and the investigation was complicated by inadequate tracing/logging.
Instead of checking the more appropriate property that other commenters have suggested (pointerType), I'm a bit surprised that the solution given by the author is to patch up the shaky heuristics even more:
> We could deduce from our final two clues the solution: we need to check for negative numbers as well as positive numbers when checking the screenX and screenY coordinates.
Actually that is what we're going to do. I'm hoping to merge in code soon that will change it to use pointerId === -1 and then fall back to screenX === 0.
At the time this code was originally written four years ago or whenever it was, not all browsers used PointerEvent for click.
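One possible reading of that plan, as a sketch only and not the BBC's actual code: prefer `pointerId` when the click is delivered as a PointerEvent, and otherwise fall back to the screen-coordinate heuristic. The function name `isKeyboardClick` is hypothetical, and checking screenY alongside screenX is an assumption on my part.

```javascript
// Hypothetical helper sketching "pointerId === -1, then fall back to screenX === 0".
function isKeyboardClick(event) {
  // When the browser dispatches click as a PointerEvent, pointerId is a number;
  // -1 is the value the author plans to treat as "not from a pointing device".
  if (typeof event.pointerId === "number") {
    return event.pointerId === -1;
  }
  // Fallback heuristic for browsers where click is a plain MouseEvent.
  return event.screenX === 0 && event.screenY === 0;
}
```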
Checking for position != 0 still does not fix it. On Windows and Linux, it is possible for a window to span multiple displays. Someone could set things up so that a valid click target for the button ends up in the (0, 0) position.
I would not write a blog post advertizing how I filed a bug with WebKit for my own coding 101 error: unsafe assumptions and relying on side effects and heuristics.
Actually not just 101, it's basically with all of us at all levels and for life. So they're in good company having made a mistake everyone makes all the time, but it was a mistake on their part, not a bug in WebKit, nor even an "interoperability issue" in WebKit or any browser.
They say they weren't aware that negative values were possible and that different browsers produce different values.
Ok, but neither of those matters.
If the function is even allowed to contain or express a negative value (i.e. right at the lowest basic level, is the returned data type actually a uint, or is it anything else? a regular int? a string?) then negative values were always a possibility even if you personally never saw one before. Saying "I didn't expect a number below 0" is barely any different from saying "I didn't expect a number above 10000".
The discrepancy between browsers doesn't matter and it isn't the browsers' fault that it tripped you up. You just made a standard boring unsafe assumption like every other programmer ever.
The entire problem is that you cared about something you don't actually care about.
You assumed that there was meaning in the absolute position of the window or the mouse pointer, when there never was, and you don't actually care about those anyway. The absolute position is like the actual internal-only row number in a db. Every row has a unique one, but it's none of your business what it is. There is only meaning in its position relative to something else like a button, or relative to its previous position to track movement.
Similarly checking for 0,0 and assuming that means keyboard is just another false heuristic that doesn't actually prove any such thing. The specs may or may not promise that the value will be 0,0 in the event of a keyboard initiated click, but no way it says that it can't be 0,0 any other way.
Don't be ashamed of this error because it's common, but don't be proud of calling these WebKit or browser interoperability bugs.
Do write up and publish the experience though as a warning and lesson to new developers about assumptions and heuristics and relying on side effects that just happen to work on the developers laptop when they tried it once and shipped it.
Also "it's for accessibility" doesn't change anything. Trying to be too smart just makes it worse. Actually that's true just generally for everything.
Why do you even have to check if screenX and screenY are non-zero (as opposed to just checking typeof event.screenX == "number")? Wouldn't that mean (and this is a wild edge-case) that if someone positioned their browser window so that the menu was in the top left corner (at position 0,0) the event handler would break again? Is this to block synthetic click events like (<div />).click()? Keyboard events don't have a screenX or screenY from what I remember as well.
This is just a heuristic to determine if the event is keydown or click.
In the article the author says that the issue is that the same function is handling both events, and they will work on refactoring it to something better.
The normal approach is just have different functions answering to different events. Or using more precise information about the event [1], instead of a heuristic.
That does seem quite obviously better. Even when insisting on checking coordinates, why use the coordinate system carried by the event that you have the least control over? Why not .pageX/.pageY, which can't trigger the failure case as long as you make sure the element isn't in the far top left?
Based on the other replies here it seems like it's to differentiate taps vs mouse clicks, keyboard events in js don't have a screenX or screenY property (you can run this in your browser console on this HN post to confirm):
  (() => {
    const logEvent = event => console.log({
      coords: [event.screenX, event.screenY],
      type: event.type
    });
    const input = document.querySelector("textarea");
    // use "keydown" instead of "keypress" to detect all keyboard input instead of just character producing input
    input.addEventListener("keydown", logEvent);
    input.addEventListener("click", logEvent);
  })();
Type in or click on the reply text input and you'll see that both entries of the coords array are undefined for all keyboard events. I haven't tried the equivalent on a touch device, however, so I'm not sure how it's handled there.
I would say many people who browse Hacker News given the fact that, at the time of writing, this is the 4th ranked article. Moreover, who cares which company the person writing this post happened to be working for when discovering this interesting browser edge case?
Source: https://bugs.webkit.org/show_bug.cgi?id=281430
Not sure if exactly the same thing but reminds me of "Booleans are a trap"
https://katafrakt.me/2024/11/09/booleans-are-a-trap/
Seems like a non bug to me.
The first mistake the developer made, was that he wanted to create a different user experience between keyboard and mouse. Stick to what you get by default and design your components so they work for both usecases. Don't try to be smart when it comes to accessibility.
What he ended up doing is what I would have considered a hack. A solution that inevitably breaks or has side effects.
The reason there rarely are good handles to do things differently in accessibility context, is because it's not something that's meant to be handled differently.
See I work in accessibility. Like I provide and create solutions direct to end users with complex needs. Not regular web accessibility. I get the view of this. It’s the same idea of universal access. But actually I don’t fully agree. Yes. If you can stick to this principle - and do try / but I promise you edge cases - which in itself is what accessibility users are all about - cause headaches. At some level you have to do custom stuff. It’s the best way. Take for example switch users. Yes. If your ui is tab able - great. But what if you need your items scannable in frequency order. Your tab index needs to change to meet the end users needs. Or eye gaze users. The accuracy level changes. Add in cognitive issues. You can’t just make a one size fits all interface. At some stage you need to significantly customize it. You can’t rely on a user just learning a complex system level interaction technique- if they can’t do that you have to customise on an individual level.
Of course there are edge cases. I work with accessibility too, for an app in the public sector where WCAG rules are no joke, so I know this as well, but even so, we don't build custom accessibility UI for our users. We (try to) build the UI with accessibility in mind so it's scalable and can be used and navigated properly by VoiceOver and keyboard.
On mobile it's not perfect either, but in general you do have features to change things like focus, grouping of elements, how the keyboard navigates the view stack, and how to access a button through custom actions, and, as you mention, to change the tab index programmatically.
Even so, not everything can be fixed or handled through standard accessibility means and as such hacks will inevitably make it into the products.
I get what you're saying but I still think that making things accessible and designing with common accessibility in mind should be default and as such it has to be thought about when designing and developing from the get go. Having to create custom interfaces to fulfill a specific need might be a good fit for some things but not when developing apps and websites unless you're targeting that user-group specifically.
Well said! It certainly applies to web development as well. Sadly, sometimes more complex solutions are needed - especially when based on user research.
Also note, it’s been about 10-15 years since the rules changed and if you want to work on a government contract, accessibility is mandatory.
I am the author.
> The first mistake the developer made, was that he wanted to create a different user experience between keyboard and mouse. Stick to what you get by default and design your components so they work for both usecases.
We have. The behaviour is mostly the same whether you're using the keyboard or a pointer (mouse/touch/pen). The only difference is that, for keyboard users, we want to turn off the animation and move the focus to the first link in the menu instead of focussing on the menu's parent <ul>.
The problem was that, as various devs have iterated on the menu over the years, it's broken the fallback behaviour. For my colleague on the funny multi-monitor set up, it should have fallen back to the keyboard no-animation behaviour with no real major difference to the UX, but instead it fell back to the no-JS experience.
So yes, generally don't try to be smart with accessibility, avoid ARIA attributes except where necessary, etc, but click events are the universal input event and work on any kind of input device and have perfect browser support. It's far better for accessibility using them instead of a mix of keydown and mousedown or pointerdown, and potentially missing other kinds of input events.
As I stated in another comment, if it was a scenario where there needs to be a major difference in behaviour between keyboard and pointers, then I would rather use separate keydown and pointerdown events.
The _mostly_ same behavior is what caused the problem though :P I'm curious, did these solutions come to pass because you had to make adjustments based on actual user feedback or was it just a developer trying to think ahead? I'm questioning whether forcing the user to tab to get to the menu item is a hindrance at all or whether the animation was a problem.
Maybe the former could have been solved using ARIA tags or maybe it would require bigger changes to the component itself. Accessibility is a roller-coaster for all these reasons alone.
> Don't try to be smart when it comes to accessibility.
“Don't try to be smart” alone is good advice in general and everywhere. Also in UI “don’t try to be original”
The BBC site has a "search box" that's actually a button that brings up the real search box. Always feels confusing. At least it's consistent across News / Sounds / iPlayer.
> The first mistake the developer made, was that he wanted to create a different user experience between keyboard and mouse.
No, they wanted to make them the same. It's just that giving a blind person the same experience as a seeing person requires different things, because they operate differently for obvious reasons. For example, a blind person can't see when an animation has finished. They expect that menu to be available once they've triggered it. However, seeing people see the dropdown appearing and then go to use it once it's ready.
> Don't try to be smart when it comes to accessibility.
In all seriousness, considering the state of accessibility as is, I think going outside the box isn't trying to be smart. It's actually being smart. The BBC frontend team is probably at the forefront of making high-traffic websites extremely usable.
> For example, a blind person can't see when an animation has finished. They expect that menu to be available once they've triggered it. However, seeing people see the dropdown appearing and then go to use it once it's ready.
For my two-cents, the BBC was simply trying too much to be "cutesy". Don't animate anything, because the silly animation on mouse click simply makes the website feel slower overall. Just open the menu as fast as the user's browser will open it.
While I can understand the author's need for screenX and screenY, the question remains. Why would screenX return the real screenX position instead of the position within the renderer (I don't think that exists?) or the rendered page (layerX and layerY)? The author's need would be met the same with the renderer position and window positions wouldn't be leaked to all visited websites.
Does anyone else find this write up confusing?
My understanding from this is that BBC want slightly different behaviour depending on whether it's a mouse or keyboard "click" (keyboard shouldn't show the animation and should focus the first link in the menu).
However, they also want the ease of binding to a single event and while binding to "click" can do this, they have no way to tell whether it was a mouse click or keyboard press which triggered the event.
To solve this they're using an unreliable heuristic after realising in Chrome if the mouse position is screenX=0, screenY=0 it means the event was either triggered by a mouse click at screenX=0, screenY=0 or a keyboard.
As someone who's worked on accessibility projects in the past, this is a really stupid idea imo, and had I reviewed a PR with something like this I would have asked for it to be reworked. While I agree browsers should ideally do the same thing, the real issue here seems to me that screenX and screenY make little sense on a "click" triggered by a keyboard.
The solution ideally would be a new event ("trigger" or something) which doesn't emit a "MouseEvent", but something more generic which could apply to both a keyboard and mouse event and provide information about the "trigger" source. Imo keyboard "clicks" are weird to begin with and would ideally be fixed with a more appropriate event.
That said, I understand this doesn't currently exist in the spec and a solution is needed now. Therefore I don't see why they couldn't also bind to a "keydown" event then if the click is triggered alongside the "keydown" on the same element, assume it was a keyboard press. That would be far more reliable and far less hacky than what they're doing, and would allow them to trigger from the single event with a bit of extra code to detect if it was a keyboard or mouse.
Hello I am the author and, yes, I totally agree some generic 'trigger' event would be far better.
To use the keydown event means we have to assume that the 'Enter' and 'Space' are the only keys we need to check for. Using 'click' is far safer from an accessibility point of view because it will always respect what your device considers to be some kind of input trigger.
As stated in the UI Events spec:
> For maximum accessibility, content authors are encouraged to use the click event type when defining activation behavior for custom controls, rather than other pointing-device event types such as mousedown or mouseup, which are more device-specific. Though the click event type has its origins in pointer devices (e.g., a mouse), subsequent implementation enhancements have extended it beyond that association, and it can be considered a device-independent event type for element activation.
And to be clear, I would not want to do it this way if it was for some really critical difference in behaviour between pointer or keyboard interactions. I'm OK with this strange mechanism here because the fallback behaviour is not that different. If you're on Safari, for example, which can't check for `screenX === 0`, then all that happens is that there will be an animation when you open the menu.
However, sadly, because of the ways various developers have added to this code over the years, it's broken that fallback behaviour and stopped it working entirely. So I've just finished a refactor to sort that out and it will hopefully be going live soon.
> Often when opening a menu, we don't want a slightly different behaviour around focus and animations depending on if the user 'clicks' with a pointer or keyboard.
Is the word “don’t” a mistake which gives the sentence the opposite of the intended meaning?
Hello I am the author and that was indeed a mistake. Whoops!
> All we had to do was change the isInvokedByMouse to check that screenX and screenY don't equal 0, rather than checking if they are greater than 0.
It's obviously extremely unlikely but what if the mouse is actually at 0,0 when the user clicks? I'm not very familiar with JS, is checking for != 0 really the best/only way to do this?
EDIT: actually upon going back, I realized I didn't fully process this sentence originally but it seems to address this:
> We should probably do further refactoring of the event handler function, since it's complicated by the fact that it also handles keydown events. For now, though, this fix will do just fine.
It seems that querying the screen position is just a heuristic they came up with to determine the nature of the event. Instinctively I would use instanceof MouseEvent for this, but even this feels risky/hacky to me.
My question is why they're relying on those heuristics. My guess is that toggleMenu is being used by multiple event handlers. Or maybe there's something else going on that is specific to their codebase.
It's hard to judge without knowing the full picture.
EDIT: Aha, there's an answer here: https://news.ycombinator.com/item?id=42174177
But they're already checking for event.type == 'click' in the revised code. So why would you want to filter out some legitimate click events?
Maybe browsers will report `click` events that aren't actually created by a pointer device (maybe a screen reader or something?). But that still raises the question of why you would care. It seems to me like if the platform wants to report it as a `click`, your app should treat it as one and not try to get "clever" about it.
For compatibility with Web content, the `click` event has become a device-independent activation event. Sites can't be expected to listen for events from every kind of device, so instead all devices send `click`s.
They care because focus for keyboard-controlled screen readers sending "click" should behave differently: an element inside the menu should receive focus, even though it's not the element that has been clicked. Otherwise, if focus stayed on the top-level menu bar, screen reader users would be lost and would have to navigate to the menu's content themselves.
Interesting. Seems like something that should be exposed more explicitly.
Apply Chesterton's Fence principle and assume there are (hopefully) comments in the real code around why this has been put in place
No, it's not. You can use a media query on whether the primary input device is a pointer device (and, further, whether it has high accuracy) and then filter on that.
I used it to select which layout to show in the past.
If you want to listen to input on touch only then you can do that and call preventDefault on the event so that the browser does not then cause a click event. Or you can just save yourself the trouble and write a click handler.
Kudos to BBC for investing in accessibility, and unfortunately discovering a nasty bug.
As an industry, why haven't we figured out how to make drop downs that consistently open for all users? Is accessibility just that hard? Are there web frameworks/web components BBC should be using that already handle this?
I've been wary (as a backend-focused full-stack developer) about tweaking the browser's components. There's so much nuance to how they work and the implementations are battle tested. The idea of creating a custom text box (for example) without doing extensive research of text box behavior across platforms seems ripe for failure. I notice broken copy/paste and dropped characters often enough (on major corporate sites too). Why are text boxes broken in 2024? React feels arrogant to me now.
Personally, I've tried to handle this with server-side templates, CSS frameworks like Bulma, minimal JS. It's not viable for sites demanding slick custom branding (vanity?) but my text boxes work and my site doesn't cost a fortune to develop. Is it accessible to BBC standards? I'm not sure.
Agreed. But ultimately many issues arise when user agents customize these elements in very dubious ways. It's ok for the most part, but there's a reason behind reset.css files, and I wager a more nuclear approach was used here to circumvent these issues completely.
I'm just trying to reason on their decision here.
Why are websites getting mouse position in screen coordinates in the first place?
I've searched for reasons and couldn't find much. The fact that a website can know where a browser window is located (window.screenX/window.screenY) and that click positions can be reported in that coordinate system sounds insane for a desktop. Tor Browser seems to spoof screenX and screenY to avoid fingerprinting.
Has anyone seen good use-cases for that feature? I'm thinking about dual-window applications that interact with each other (I think I saw a demo of something like this a while ago on HN but I wasn't able to find it again), or sites where behavior depends on their location on the virtual screen.
Back in the HTML 4 days we did these shenanigans all the time. I worked on very over-the-top sites that played with multiple windows talking to each other and moving in synchrony. I've tried looking for examples on archive.org (e.g. I know we did this a ton on Flash-heavy sites like the Design Museum in London) but alas the ones I was looking for are broken in that archive.
The webkit report talks about it (https://bugs.webkit.org/show_bug.cgi?id=281430), while the article doesn't seem to, for some reason. Another HN comment with summary: https://news.ycombinator.com/item?id=42174177
It’s useful for creating games where the graphics are composed of many small browser windows which interact with one another.
For example:
https://youtu.be/3al8prbfK5o?si=loNtyqIfMFkppm5V
If you're reacting to click events, you might want to know the coordinates of where you're clicking. I mostly use this for click and drag stuff since you can get the delta between events and update position of the thing being dragged.
Why they're checking for coordinates instead of checking event.type is beyond me. Still, I appreciate the write-up; it is a good puzzle, and it's relatable to come across code you didn't write and ask: why is it important that the click coordinate is nonzero? Why can't we just check that event.target is the button we want to activate? Why are we using JavaScript at all when a details/summary tag would do the same job?
Relative coordinates sure, but why would you need the absolute position?
I'm with you on the second point - as unlikely as it is for the click to occur at the origin, it's still a legitimate value being abused as an indicator of something that might not actually be true - quite frankly the code was bad to begin with, and it was still bad after the fix.
Because this was easy to do during the 10 days allocated to develop JavaScript in 1995, and then backwards compatibility kicked in :(
I use it for a JavaScript-free CAPTCHA, works well, but it only sends the x and y of mouse click upon clicking on it.
Uh. So they can keep track of what the user is doing?
Why would you just send a document when you can generate a heat map of where the user is on your website. And then complain about the performance and wonder why it costs so much to run a modern website.
The issue isn't so much the coordinate of the mouse within a page, but that the coordinates are relative to the virtual screen layout. It describes where your window is located on the screen(s) and the click is expressed in screen coordinates. Mapping those coordinates to your website's renderer coordinates requires additional calculation.
layerX[1] while non-standard is supported and returns a position relative to the top of the page or the top of the parent element. This makes coordinates positive only and 50,50 is the same for all users. For screenX, 3000,1567 is the same coordinate as 15,37 depending on where the window is located.
[1] https://developer.mozilla.org/en-US/docs/Web/API/MouseEvent/...
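As a rough illustration of that "additional calculation", here is a minimal sketch. `screenToWindow` is a hypothetical helper, and the window origin used in the example is made up to match the numbers above. In a browser the inputs would come from event.screenX/event.screenY and window.screenX/window.screenY. The result is relative to the outer window corner, not the page, so browser chrome (toolbars, borders) is not accounted for.

```javascript
// Hypothetical helper: convert a screen-space point into a point relative to
// the window's top-left corner, given where the window sits on the virtual screen.
function screenToWindow(point, windowOrigin) {
  return {
    x: point.x - windowOrigin.x,
    y: point.y - windowOrigin.y,
  };
}

// Assumed example: a window whose top-left corner is at (2985, 1530) on the virtual screen.
const relative = screenToWindow({ x: 3000, y: 1567 }, { x: 2985, y: 1530 });
// relative is { x: 15, y: 37 }, the same point a window at (0, 0) would report as 15,37.
```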
Haha. Welcome to the world of analytics. Lots of sites are recording exactly what you are doing on their site including mouse movement at all times.
Why are you filtering for screen coordinates in the first place? What if the user is using e.g. an alternative input device that doesn't have a screen? The `click` event should be enough indication that the user has tried to activate the menu. Why reinvent the wheel?
> Why are you filtering for screen coordinates in the first place?
FTA:
> The isInvokedByMouse was checking whether the click event was invoked by a mouse or touch pointer – rather than a keyboard – by checking if the screenX or screenY coordinates were a positive number.
They were trying to detect whether it was keyboard or mouse activation, and whoever wrote it assumed that screen coordinates of mouse events would always be positive.
Right, but the article doesn't explain why they cared whether it was keyboard or mouse activation. The linked WebKit bug goes into more detail, but it's still lacking an explanation of why alternative, more common and widely deployed strategies (like having a capturing keyup event that triggers earlier in the render loop) wouldn't be a better idea instead.
Also, if you really want to determine whether a MouseEvent is "real" or "synthetic", and you don't want to worry about when mouse events are triggered relative to keyboard events in the event loop (although it doesn't seem very hard to keep track of), it seems like you can use the current click count (i.e., event.detail). This works on both Chrome and Safari—it's 1 for mouse clicks, and 0 for keyboard "clicks", but the spec text is also a little contradictory and under-specified: the "click" event handler says that "the attribute value MUST be 1 when the user begins this action and increments by 1 for each click" (https://w3c.github.io/uievents/#event-type-click) but it also says "This MUST be a non-negative integer indicating the number of consecutive clicks of a pointing device button within a specific time" (https://w3c.github.io/uievents/#current-click-count), and the definition of "pointing device button" seems to exclude synthetic keyboard events (since those are handled separately)
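A minimal sketch of that event.detail check, assuming the behaviour described above (1 for mouse clicks, 0 for keyboard "clicks"). `looksLikeKeyboardClick` is a hypothetical name. Since the spec text is under-specified, as noted, this is a heuristic rather than a guarantee. A script calling element.click() also produces a click with detail 0, so the check cannot tell those apart from keyboard activation.

```javascript
// Hypothetical helper: true when a click event carries a click count of 0,
// which (per the comment above) is what keyboard-initiated clicks report.
function looksLikeKeyboardClick(event) {
  return event.type === 'click' && event.detail === 0;
}
```

In a handler this would be used as `if (looksLikeKeyboardClick(event)) { /* skip animation, focus first link */ }`, with the pointer behaviour as the default branch so that any misclassification degrades gently.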
> They were trying to detect whether it was keyboard or mouse activation
But the code shown doesn't do different stuff for Keyboard vs Mouse, it just checks if it is either one of them. Why would you do that? Which other click event types are there that you want to filter?
I omitted that code to keep the article simpler. We want to turn off the animation on keyboard, and move the focus to different things depending on if it's a keyboard or pointer user.
What is the correct way to check if you have a mouse click rather than a keyboard click? I would be tempted to set a module-level flag based on what most recently happened: if there was a "mousedown" event more recently, we'll set isKeyboard to false and isMouse to true, and vice versa if "keydown" fired.
Then we wouldn't need the isInvokedByMouse and isInvokedByKeyboard functions.
Is there a better way? Relying on screen coordinates for this is highly dubious and I would argue a hack.
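For what it's worth, the flag idea can be sketched with a single value instead of two booleans. This is a hedged sketch, not a recommendation: `createInputTracker` is a hypothetical name, and the approach assumes every click is preceded by a keydown or a mousedown/pointerdown, which assistive technology is not guaranteed to send.

```javascript
// Hypothetical tracker: remembers which kind of input event was seen most recently.
function createInputTracker() {
  let last = null; // 'keyboard', 'pointer', or null before any input is seen

  return {
    // Feed every keydown / mousedown / pointerdown event through this.
    record(event) {
      if (event.type === 'keydown') {
        last = 'keyboard';
      } else if (event.type === 'mousedown' || event.type === 'pointerdown') {
        last = 'pointer';
      }
      // Other event types leave the value unchanged.
    },
    get last() {
      return last;
    },
  };
}

// In a browser this might be wired up with capture-phase listeners, for example:
//   const tracker = createInputTracker();
//   document.addEventListener('keydown', tracker.record, true);
//   document.addEventListener('pointerdown', tracker.record, true);
// and the click handler would then read tracker.last.
```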
This seems like a self-inflicted bug resulting from an incorrect heuristic (the assumption that positive screenX/Y values represent a mouse event), and the investigation was complicated by inadequate tracing/logging.
Instead of checking the more appropriate property that other commenters have suggested (pointerType), I'm a bit surprised that the solution given by the author is to patch up the shaky heuristics even more:
> We could deduce from our final two clues the solution: we need to check for negative numbers as well as positive numbers when checking the screenX and screenY coordinates.
Actually that is what we're going to do. I'm hoping to merge in code soon that will change it to use pointerId === -1 and then fall back to screenX === 0.
At the time this code was originally written four years ago or whenever it was, not all browsers used PointerEvent for click.
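A rough sketch of what that "pointerId first, screen coordinates as fallback" check could look like. This is not the BBC's actual code: `clickSource` is a hypothetical name, and the fallback branch is the same browser-dependent heuristic discussed elsewhere in this thread. A pointerId of -1 means the click did not come from a pointing device, which covers keyboard activation but also script-triggered clicks.

```javascript
// Hypothetical helper: classify a click as coming from a pointer or not.
function clickSource(event) {
  // Preferred path: browsers that dispatch click as a PointerEvent expose pointerId.
  if (typeof event.pointerId === 'number') {
    return event.pointerId === -1 ? 'non-pointer' : 'pointer';
  }
  // Fallback heuristic for browsers where click has no pointerId:
  // treat a click reported at screen (0, 0) as keyboard/non-pointer.
  return event.screenX === 0 && event.screenY === 0 ? 'non-pointer' : 'pointer';
}
```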
I’m impressed he wrote a blog without having to get authorisation from a dozen layers of bbc management and lawyers.
> I checked the (DOM UI Events) spec to see if that was correct, but there didn't seem to be any specific information about it
Story of my life: finding out that the details that apparently matter when I'm debugging haven't actually been written in the spec (any spec).
Checking for position != 0 still does not fix it. On Windows and Linux, it is possible for a window to span multiple displays. Someone could set things up so that a valid click target for the button ends up in the (0, 0) position.
I would not write a blog post advertizing how I filed a bug with WebKit for my own coding 101 error: unsafe assumptions and relying on side effects and heuristics.
Actually not just 101, it's basically with all of us at all levels and for life. So they're in good company, having made a mistake everyone makes all the time, but it was a mistake on their part, not a bug in WebKit, nor even an "interoperability issue" in WebKit or any browser.
They say they weren't aware that negative values were possible and that different browsers produce different values.
Ok, but neither of those matters.
If the function is even allowed to contain or express a negative value (i.e. right at the lowest basic level, is the returned data type actually a uint, or is it anything else? a regular int? a string?) then negative values were always a possibility even if you personally never saw one before. Saying "I didn't expect a number below 0" is barely any different from saying "I didn't expect a number above 10000".
The discrepancy between browsers doesn't matter, and it isn't the browsers' fault that it tripped you up. You just made a standard boring unsafe assumption like every other programmer ever.
The entire problem is that you cared about something you don't actually care about.
You assumed that there was meaning in the absolute position of the window or the mouse pointer, when there never was, and you don't actually care about those anyway. The absolute position is like the actual internal-only row number in a db. Every row has a unique one, but it's none of your business what it is. There is only meaning in its position relative to something else like a button, or relative to its previous position to track movement.
Similarly, checking for 0,0 and assuming that means keyboard is just another false heuristic that doesn't actually prove any such thing. The specs may or may not promise that the value will be 0,0 in the event of a keyboard-initiated click, but there's no way they say that it can't be 0,0 any other way.
Don't be ashamed of this error, because it's common, but don't be proud of calling these WebKit or browser interoperability bugs.
Do write up and publish the experience though as a warning and lesson to new developers about assumptions and heuristics and relying on side effects that just happen to work on the developers laptop when they tried it once and shipped it.
Also "it's for accessibility" doesn't change anything. Trying to be too smart just makes it worse. Actually that's true just generally for everything.
RE this line of code at the bottom of the article:
Why do you even have to check if screenX and screenY are non-zero (as opposed to just checking typeof event.screenX == "number")? Wouldn't that mean (and this is a wild edge-case) that if someone positioned their browser window so that the menu was in the top left corner (at position 0,0) the event handler would break again? Is this to block synthetic click events like (<div />).click()? Keyboard events don't have a screenX or screenY from what I remember as well.
This is just a heuristic to determine if the event is keydown or click.
In the article the author says that the issue is that the same function is handling both events, and they will work on refactoring it to something better.
The normal approach is just have different functions answering to different events. Or using more precise information about the event [1], instead of a heuristic.
[1] A suggestion was made by this poster: https://news.ycombinator.com/item?id=42174436
Why are screenX and screenY relevant here as far as their code goes?
Were they using those values in their code?
Very interesting article but I'm missing the step where it would impact their code ...
...because when they are 0 one can infer that the event came from a keypress rather than a mouse. They want to know this.
Then it would make a lot more sense to check event.pointerType == 'mouse' vs event.pointerType == '' (for keyboard)
That does seem quite obviously better. Even when insisting on checking coordinates, why use the coordinate system carried by the event that you have the least control over? Why not .pageX/.pageY, which can't trigger the failure case as long as you make sure the element isn't in the far top left?
Based on the other replies here it seems like it's to differentiate taps vs mouse clicks, keyboard events in js don't have a screenX or screenY property (you can run this in your browser console on this HN post to confirm):
Type in or click on the reply text input and you'll see that the coords array is undefined for all keyboard events. I haven't tried this equivalent on a touch device however, so not sure how it's handled there.
This does not work when the mouse is actually at 0,0
An imperfect solution for a situation that will practically never happen and have no noticeable downside.
That flickering Ace Attorney GIF was extremely annoying.
Who cares if a Bolshevik Broadcasting Corporations' navigation bar component broke depending on the external monitor?
I would say many people who browse Hacker News given the fact that, at the time of writing, this is the 4th ranked article. Moreover, who cares which company the person writing this post happened to be working for when discovering this interesting browser edge case?