
Nothing is worse than spending hours setting up dozens of tests, getting them to work in Chromium-based browsers (Chrome, Edge, etc.), and then running the same Selenium automation against Firefox and Safari, only to get StaleElementReferenceExceptions.

Browsers and the DOM

Browsers render HTML differently. Some, like Chrome and Edge, are capable of holding on to the node/UUID values of HTML elements while dynamic behavior is happening in modals, forms and pages. Other browsers take a different approach to digesting HTML and briefly lose touch with the elements while dynamic aspects are happening in the application/web page.

UUID Mixups

In other words, the HTML class/id/name (or other identifier) hooks to a UUID/node value under the hood. Let’s say you have a modal you’re testing. It has a form with four fields: description, phone number, extension, and date. Two of those fields (phone number and date) are handled by JavaScript, which checks whether the phone number is valid and provides a calendar date picker for the user. When the modal first loads, a UUID is generated for each field. The field with the CSS id of “description” gets a specific UUID at load. However, when something dynamic occurs in the modal, such as validating a phone number, some browsers regenerate the UUIDs for the fields. The CSS id value hasn’t changed, but the underlying UUID has. A manual test (assuming no bugs in the code) will still pass, as you can enter values and submit. But automation is another story.

WebDriver/Selenium and other automation systems may still hold the original UUID that was generated during the browser’s original page load. When something triggers a regeneration of the UUID, WebDriver doesn’t get updated, so it issues the command send_keys("this is my description") to “#description”, only for it to fail. What the automation engineer gets back is something like this:

StaleElementReferenceException: Message: The element reference of <input id="faxDescription_JsTcY" class="form-control description" name="description" type="text"> is stale; either the element is no longer attached to the DOM, it is not in the current frame context, or the document has been refreshed 

To a human, the element is certainly still attached to the DOM. The modal hasn’t changed. The page hasn’t refreshed. This browser happened to refresh the UUID of the elements and Webdriver hasn’t picked up the change.

Example

Webdriver is attempting to send text to “#description”, but really it’s sending to a UUID of cac445d2-fd1c-4be4-9c3e-2ea88b1abeed – as this was the original UUID connected to the field – however, under the hood the field now has a UUID of b8ef42dd-5972-44f1-8095-2b94dc0c884e. This is where the breakage happens – we get the Stale Reference error because although the identifier in the HTML is still there, the underlying node or UUID has changed.
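
To make the mixup concrete, here is a toy model in plain Python. It does not use Selenium and is not how any browser or driver is actually implemented; ToyPage, StaleReferenceError, and rerender are hypothetical names invented for this sketch. It only mirrors the idea above: commands are addressed to the underlying reference, not to the CSS id.

```python
import uuid


class StaleReferenceError(Exception):
    """Stand-in for Selenium's StaleElementReferenceException in this toy model."""


class ToyPage:
    """Maps each CSS id to the opaque reference currently behind it (assumed model)."""

    def __init__(self, css_ids):
        self._refs = {css_id: str(uuid.uuid4()) for css_id in css_ids}
        self.values = {}

    def find(self, css_id):
        # A lookup returns whatever reference the element has right now.
        return self._refs[css_id]

    def rerender(self, css_id):
        # A dynamic update replaces the node: same CSS id, new reference.
        self._refs[css_id] = str(uuid.uuid4())

    def send_keys(self, ref, text):
        # The command is addressed by reference, not by CSS id.
        for css_id, current in self._refs.items():
            if current == ref:
                self.values[css_id] = text
                return
        raise StaleReferenceError(f"reference {ref} is stale")


page = ToyPage(["description", "phone"])
cached = page.find("description")   # reference captured at page load
page.rerender("description")        # e.g. phone validation redraws the form

try:
    page.send_keys(cached, "this is my description")
    outcome = "sent"
except StaleReferenceError:
    outcome = "stale"               # the old reference no longer matches anything

# A fresh lookup returns the new reference, so the same command now succeeds.
page.send_keys(page.find("description"), "this is my description")
```

In this model the CSS id “description” never changes, yet the cached reference stops working the moment the field is redrawn.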

TL;DR

Not all browsers read HTML the same way. Chromium-based browsers tend to be forgiving about the underlying values of each HTML element, while Firefox and Safari are less forgiving. Simple, unseen updates to the DOM can cause some browsers to regenerate the underlying UUID/node values of each element. The values that WebDriver knows from the page load may change during the course of testing, causing the browser to complain that WebDriver is attempting to reach elements that are no longer part of the DOM (even though they are, but their underlying UUID/node values have changed).

Solutions

Some modern automation packages claim to have resolved this type of flakiness by using their own browser drivers. I have no idea if they have overcome these specific issues, as I mostly automate web and mobile via Selenium/Appium. There are problems with modern approaches (such as Cypress), at least IMO, so I’ve stayed with the old standard of Selenium. If there is a modern tool that solves this, you can afford it, it’s cross-compatible with web, mobile, etc., and it fits your needs, then that might be a solution.

Implicit Waits replaced with Explicit

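Automation engineers who used explicit wait(X) calls used to get shamed. “Always use waitForElement…” was the standard advice, and it’s good advice. Implicit waits poll the DOM, up to a specified time limit, for an element to be located or become visible. Explicit waits simply pause for a specified amount of time. (A note on terminology: Selenium’s own documentation uses these names differently. There, an “explicit wait” is a polling wait on a condition, such as WebDriverWait, an “implicit wait” is a driver-wide timeout applied to element lookups, and a fixed pause is just a sleep. In this post, “explicit” means a fixed pause and “implicit” means a polling wait.) What if the element loads on average 1.5 seconds after page load? You could set a wait(2) to wait 2 seconds, or you could wait for 3 seconds, anticipating QA environment load… but then every test run has to wait that same amount of time. Why wait longer than necessary?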
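
Here is a minimal sketch of the two styles in plain Python, just to show the trade-off. It doesn’t touch Selenium; fixed_wait, polling_wait, and is_ready are hypothetical names for this example, and the clock and sleep parameters exist only so the behavior can be checked without really waiting.

```python
import time


def fixed_wait(seconds, sleep=time.sleep):
    """Always pause for the full duration, ready or not. Returns seconds waited."""
    sleep(seconds)
    return seconds


def polling_wait(is_ready, timeout, interval=0.25,
                 clock=time.monotonic, sleep=time.sleep):
    """Check is_ready() repeatedly and return as soon as it is true.

    Returns the seconds waited, or raises TimeoutError once `timeout` has elapsed.
    """
    start = clock()
    while True:
        if is_ready():
            return clock() - start
        if clock() - start >= timeout:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        sleep(interval)
```

With an element that shows up after 1.5 seconds, polling_wait(..., timeout=3) gives up control after roughly 1.5 seconds, while fixed_wait(3) always costs the full 3.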

Implicit waits make more sense in general… until you hit the stale reference error. When you get into cross-browser automation testing, the luxury of dealing with wonderful browsers is gone. Firefox and Safari become real issues, as they can’t find elements using wait-fors. In my experience, Firefox will fail instantly, as though the waitFor isn’t even attempted. No matter how long I tell it to wait for an element to load, if the underlying UUID has changed, the wait in Firefox never succeeds.

I hate this solution, but the only one I’ve found is to add a fixed pause after each dynamic step that might cause the underlying UUIDs of the elements to change. Tests become much longer, but at least they complete.

Conditional Browser Isolation

If you use explicit waits around dynamic elements that are giving stale reference errors, there usually isn’t a need to force Chrome, Edge and other browsers to follow the same logic. Those tests can at least run faster if you add conditional logic: if the browser is Safari or Firefox, run the code with waits around it; otherwise, run the code without the waits.
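
A rough sketch of that conditional logic, in Python. The settle function, the SLOW_PATH_BROWSERS set, and the 1-second default are all placeholders I made up for illustration; tune them to your own application.

```python
import time

# Browsers that needed the extra pause in my runs (lowercased for comparison).
SLOW_PATH_BROWSERS = {"firefox", "safari"}


def settle(browser_name, seconds=1.0, sleep=time.sleep):
    """Pause after a dynamic step, but only for browsers that need it.

    Returns the number of seconds actually waited.
    """
    if browser_name.strip().lower() in SLOW_PATH_BROWSERS:
        sleep(seconds)
        return seconds
    return 0.0
```

With Selenium’s Python bindings, the browser name would typically come from driver.capabilities["browserName"], though check what your driver actually reports. You would then call settle(browser_name) right after each step that triggers dynamic behavior, for example after leaving the phone number field and before typing into the description.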

Waits Solve the Issue

Waits certainly solve the issue. They are an ugly fix, but they are a fix. I was migrating some test harnesses from Chrome/Edge to include Safari and Firefox and ended up with complete failure on the latter side, as almost every page I test has dynamic elements. Safari and Firefox thought the pages were randomly losing the elements from the DOM, because the underlying UUIDs were changing. It was very frustrating, but by putting in the time and effort to slow down the tests to account for the dynamic validation going on, I was able to get 100% success on Safari and Firefox.
