
The AlertFox Blog - Web Performance News

Monday, March 18, 2013

Google Apps Service Disruption, 502 Error

Google Drive has been experiencing a service disruption for a few hours now. Accessing Google Docs is either slow or results in a 502 error.


Google itself says on their Public Status Dashboard:

We're aware of a problem with Google Drive affecting a significant subset of users. The affected users are unable to access Google Drive. We will provide an update by 3/18/13 5:10 PM detailing when we expect to resolve the problem. Please note that this resolution time is an estimate and may change.
 
Our AlertFox service is, of course, not affected by this. However, we use Google Docs internally and therefore use AlertFox to monitor its status/SLA. Time for a tea break for some of us here...

Update: The service is restored. The overall downtime was 2.5 hours.
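
To put the 2.5 hours into SLA terms, the arithmetic can be sketched as below. This is only an illustration, not how AlertFox computes its reports: the function name `uptime_percent` and the choice of a full 31-day March as the reporting window are assumptions.

```python
def uptime_percent(downtime_hours: float, period_hours: float) -> float:
    """Return the share of the period during which the service was up, in percent."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must lie within the period")
    return 100.0 * (period_hours - downtime_hours) / period_hours


# Assumption: the reporting window is all of March 2013 (31 days).
march_hours = 31 * 24  # 744 hours
print(f"{uptime_percent(2.5, march_hours):.2f}%")  # prints 99.66%
```

Under that assumption, a single 2.5-hour outage already puts the month below a 99.9% availability target.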

Thursday, February 21, 2013

Instant Website Uptime Reports

As of last night, we have improved our uptime report email generation. You now receive the daily, weekly and monthly reports immediately after midnight in your time zone. Instant information at your fingertips. So if you have a meeting with your team on Monday morning, the weekly report from AlertFox is already waiting in your inbox.


Friday, February 1, 2013

New Uptime Report Format Goes Live

Next Monday (February 4, 2013) our new uptime email report format goes live. It features a more readable design and helps you immediately identify incidents that caused downtime. All checks that have problems are color-coded with a red background.




And, based on our users' feedback, no more milliseconds. All times in the AlertFox Control Panel and reports are now reported in seconds. Of course, the accuracy of the measurements themselves has not changed.

Friday, December 14, 2012

Web Performance Meetup @ Karlsruhe

I had the chance to participate in the Karlsruhe web performance meetup group kickoff meeting on Tuesday.
It started with a presentation by the organizer, Dennis Westermann (SAP Research). He showed an embarrassing page load performance comparison between the Amazon.com website and the Mediamarkt online store. Mediamarkt is the German equivalent of BestBuy, and seems to have similar issues adjusting to the Internet age. It wasn’t a surprise that Amazon’s performance blows Mediamarkt’s performance out of the water. But it was surprising how much low-hanging “performance improvement” fruit had been overlooked by Mediamarkt before their high-profile launch, such as image compression. If the Mediamarkt team stumbles upon this blog, Dennis’ recommended web performance entry points are:

-  http://www.perfplanet.com/
-  Steve Souders
-  VelocityConf.com
-  WebpageTest.org  (for creating waterfall charts and load time videos)


Next, Jackson Gabbard (Facebook engineer, UK) gave his talk on "Balancing HTML and Native [Application Development]".

In his own words: The choice of HTML or native is a crucial one that affects the trajectory of a project. The face-value tradeoffs can lead even the most careful engineer down the wrong path for her or his project. After being an early adopter of HTML as a platform for our mobile applications, Facebook has evolved into a more native system. The reasons for this shift are many and mostly learned at the end of great pains in performance tuning, reliability work, application layout, and user experience. Learn the real-world trade-offs between native application development and HTML from one of the engineers who helped build Facebook mobile.

My grab bag of takeaways:
-  FB internal API was (is?) largely unspec’ed. It grew organically for years.
-  FB goal is to support all mobile devices, at least to an extent.
-  FB users' commenting activity is directly related to page (app) performance. If a page loads too slowly, users quickly give up commenting.
-  60% of all Facebook logins are on mobile, a dramatic increase compared to last year (I hope I understood that correctly; if so, a very impressive number).
-  HTML GUI lacks performance vs native app, but wins on flexibility.
-  HTML GUI has a dramatically shorter release cycle (no app store review).
-  His recommendation: HTML for prototyping, native if the feature set is clear.
-  Response time rule of thumb: at 200 ms, humans feel something is no longer instant; at 2 s, users feel things are too slow.
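
The response time rule of thumb from the talk can be written down as a small classifier. This is a rough sketch, not something shown at the meetup: the function name `perceived_speed`, the three labels, and the treatment of the exact boundary values are assumptions.

```python
INSTANT_LIMIT_S = 0.2  # 200 ms: above this, a response no longer feels instant
SLOW_LIMIT_S = 2.0     # 2 s: above this, users feel things are too slow


def perceived_speed(response_time_s: float) -> str:
    """Map a measured response time in seconds to a rough user perception."""
    if response_time_s < 0:
        raise ValueError("response time cannot be negative")
    if response_time_s <= INSTANT_LIMIT_S:
        return "instant"
    if response_time_s <= SLOW_LIMIT_S:
        return "noticeable"
    return "too slow"


for sample in (0.15, 0.8, 3.5):
    print(sample, perceived_speed(sample))
```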
 
Useful tools mentioned:
-   http://modernizr.com/
-   http://drupal.org/project/browscap


More information on the Karlsruhe Web Performance Meetup, and the next meeting, can be found at http://www.meetup.com/Karlsruhe-Web-Performance-Group/


Tuesday, December 4, 2012

NEW! AlertFox extends 1-minute interval classic monitoring to ALL PRO plans!

The AlertFox team is excited to announce that our newest and most talked about 1-minute interval classic monitoring feature has been extended to ALL of our PRO plans!

Know immediately if there is a problem with your critical applications by monitoring their response times every minute! And to be certain that a slowdown is real when it occurs, we’ll monitor your web page from two separate locations within your selected region.

Here’s a recap:

-- AlertFox System update: New features for classic sensors
-- 1-min interval classic sensors for HTTP (URL), DNS (URL), and FTP (Hostname) at no additional charge for all PRO users
-- PRO1 users can: Monitor up to 10 sites/servers every minute
-- PRO2 users can: Monitor up to 20 sites/servers every minute
-- PRO3 users can: Monitor up to 30 sites/servers every minute

Happy Monitoring!

Thursday, November 8, 2012

New Classic Uptime Monitoring Locations

Here's a feature request we've heard about often: Users want the same global zone selection options in the old-fashioned classic uptime sensors that we currently offer with our real browser based transaction monitoring sensors. As of today, this feature is available in all accounts and for all users!

You can now switch your classic sensors from the default World location (running simultaneously on servers around the world) to measurement stations that run "only" on US-, Europe- or Asia-based measurement servers ("zones"). To avoid redundant and false alarms, AlertFox classic monitoring always tests simultaneously from at least two different measurement servers, and only triggers an alert if two or more servers report a problem.
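
The "two or more servers must agree" rule can be sketched in a few lines. This is a simplified illustration under stated assumptions, not AlertFox's actual implementation: the function name `should_alert`, the station names, and the representation of a check result as a boolean are all hypothetical.

```python
from typing import Mapping

MIN_STATIONS = 2  # every check runs from at least two measurement servers
MIN_FAILURES = 2  # an alert needs two or more servers reporting a problem


def should_alert(results: Mapping[str, bool]) -> bool:
    """Decide whether to alert.

    `results` maps a (hypothetical) station name to True if its check succeeded.
    """
    if len(results) < MIN_STATIONS:
        raise ValueError("need results from at least two measurement servers")
    failures = sum(1 for ok in results.values() if not ok)
    return failures >= MIN_FAILURES


# One station failing alone is treated as a local glitch, not an outage.
print(should_alert({"us-east": True, "us-west": False}))   # prints False
print(should_alert({"us-east": False, "us-west": False}))  # prints True
```

Requiring agreement trades a little detection sensitivity for fewer false alarms caused by a problem at a single measurement server.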


Classic (http request-style) monitoring was upgraded. You can now select different measurement zones in addition to the default world view.

Example report running measurements exclusively on US based monitoring stations. 

Friday, October 26, 2012

What’s new: AlertFox updates all measurement servers


The new measurement servers will go live early next week. Key updates include:

-  iMacros: 8.0.3.2216
-  Firefox: 16.0.1
-  iMacros for Firefox: 7.6.00
-  Chrome: 22.0.1.1229.96
-  iMacros for Chrome: 5.3
-  Java: v7 update 9
-  Flash: 11.4.402.287


We’ve done the dirty work. You’re free to sit back, relax, and monitor away!

Thursday, October 25, 2012

Banking outages on the rise: Learn to protect yourself before it’s too late

Last month saw a rise in slowdowns experienced by two leading online banking websites – Bank of America and Chase.  The issues were likely caused by a "distributed denial of service" (DDoS) attack, which can cause interruptions through varied means of flooding traffic to a particular network -- thereby overloading its servers.

With website outages and mounting worldwide traffic on the rise, issues of any kind remain costly, frustrating, and altogether damaging occurrences. For these reasons, it’s imperative to report on slow or non-responsive web applications before your end-users are impacted (be it your employees or customers). 


AlertFox is our solution for proactively monitoring the performance of your websites and web applications. Check it out for yourself and protect your bottom line today -- before it’s too late.

Monday, October 22, 2012

[Resolved] Amazon EC2 currently down. Affecting Heroku, Reddit,... and AlertFox


Ouch, that was a bigger issue. Amazon reported this:

We are currently experiencing degraded performance for EBS volumes in a single Availability Zone in the US-EAST-1 Region. New launches for EBS backed instances are failing and instances using affected EBS volumes will experience degraded performance.

Taken at face value, this does not sound too bad, and we could have worked around it quickly by starting a replacement server in a new availability zone.

Here is what user "hexix" reported in the HackerNews forum, and it describes the issue perfectly:

"AWS completely crapped out. Their status updates that make it sound like it was a tiny little area of their data center. It was pretty much the entire zone and then whenever there is an outage affecting an entire zone it brings down global services and even other zones as well. 
We had servers in the bad zone and started to see issues. When we then started to use the the AWS control panel to spin up replacement servers went to use the cool cloud features that are made for this, the entire thing completely fell on its face. I couldn't launch new EC2 servers either because the API was so bogged down, or the new zone I was launching in was restricted because of load.
Basically, the thing that nobody keeps in mind when they think it's so cool that you can spin up servers to work around outages is that EVERYONE IS DOING THAT. This is Amazon's entire selling point and when it comes to doing it, it doesn't work!"

After almost five hours we were able to start the replacement server and AlertFox is operating normally again. 

No measurements could be made during the downtime. This time is clearly indicated with "white boxes" in your Tactical Overview dashboard. On the positive side, our system recovered quickly as soon as we could launch the replacement instance. At no point were false alarms triggered.

We apologize for any inconvenience caused. If there are any open questions please contact us.





Wednesday, October 3, 2012

Tips and tricks: How to immediately detect pending items

If you want to enforce a completion (run) time threshold on certain steps within one of your transactions, simply add SET !TIMEOUT_STEP <x> before any command in the script that you wish to limit to a maximum of <x> seconds. With this command in place, AlertFox can immediately report and alert on the pertinent error within your transaction monitor (sensor).

 *** The default step timeout is six seconds per TAG, IMAGESEARCH or IMAGECLICK command. If six seconds is sufficient, there is no need to add SET !TIMEOUT_STEP <x>.

Happy monitoring!
Copyright © 2010 AlertFox - Web Application Monitoring