The future of Excel as a programming language.
It may be the oldest piece of software still in widespread use. 34 years ago, just three years after Apple introduced its first Macs, Microsoft released the first version of its familiar Excel spreadsheet app, initially a rough copy of Dan Bricklin’s VisiCalc.
Fast forward three decades, and the Irish Times noted in 2017 that Microsoft CEO Satya Nadella was calling Excel Microsoft’s most important consumer product, pointing out that it had over 750 million users.
So it feels almost historic when one of the world’s largest corporations augments a crucial component of its Office software suite — yet sure enough, Excel has been upgraded with a major new feature.
Microsoft’s researchers believe they’ve now finally transformed Excel into a full-fledged programming language, thanks to the introduction of a new feature called LAMBDA. “With LAMBDA, Excel has become Turing-complete. You can now, in principle, write any computation in the Excel formula language,” a Microsoft blog proclaimed.
“Being Turing complete is the litmus test of a full-fledged programming language,” explained a new article in Visual Studio magazine. And it adds that “early community response has been encouraging,” noting that Microsoft researchers are enthusiastically envisioning skilled Excel users creating functions “that appear seamlessly part of Excel to their colleagues, who simply call them.”
Hey, Office Insiders— LAMBDA for Excel is now available!
✔ Define custom functions in Excel’s formula language.
✔ Transform custom functions and wrap them up in a LAMBDA function.
— Microsoft Excel (@msexcel) February 3, 2021
Here’s a look at these changes and what they portend for the future of Excel as a programming language.
Excel users do much of their work using formulas — where the input into a cell starts with an equals sign followed by some kind of calculation (“=A2 + B2”). Microsoft’s blog calls formulas the world’s most widely used programming language — yet it had always been limited to a pre-defined universe of options.
To roll their own custom functions, users had to use Microsoft’s other macro-based programming language, Visual Basic for Applications. (Or, starting in 2018, JavaScript — and of course, Microsoft’s JavaScript-superset TypeScript.)
In a video appearance at POPL 2021, long-time Microsoft researcher Simon Peyton Jones noted that Excel’s end users were implementing functions using JavaScript, a point reiterated in a Microsoft Research blog post by a senior principal researcher and a senior principal research manager. The post also stated: “Excel formulas are written by more users than all the C, C++, C#, Java, and Python programmers combined.”
But now all those users can write actual programs without leaving the world of Excel formulas. More specifically, formulas written in Excel can now be “wrapped” inside a named LAMBDA function — and it can then be called from anywhere else in the spreadsheet. And yes, it supports recursive programming, so you can even call your named function from within your named function.
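As a loose analogy only, the idea of binding a name to an anonymous function and then calling that name from inside its own body can be sketched in Python. This is not Excel formula syntax, and the names `NAMES`, `define_name`, `call` and `FACT` are hypothetical stand-ins for entries in the Name Manager.

```python
# Hypothetical stand-in for Excel's Name Manager: maps a name to a function
# plus an optional comment. It mirrors the idea only, not Excel's syntax.
NAMES = {}


def define_name(name, func, comment=""):
    """Register a function under a name, much like naming a LAMBDA."""
    NAMES[name] = {"func": func, "comment": comment}


def call(name, *args):
    """Look up a named function and apply it to the given arguments."""
    return NAMES[name]["func"](*args)


# A recursive named function: the body calls itself through its own name.
define_name(
    "FACT",
    lambda n: 1 if n <= 1 else n * call("FACT", n - 1),
    comment="Factorial, defined recursively",
)

print(call("FACT", 5))
```

The lookup by name happens at call time, which is what lets the function refer to itself before the definition is complete.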
The names are supplied in the “Name Manager” choice tucked away in Excel’s Formulas toolbar. And just like git, users can enter a comment when creating their function.
Microsoft Senior Researcher Jack Williams described it at POPL 2021, “In Excel, it is now possible to build real, full-fledged programming experiences… We can now start to build things that look like real programs.”
The blog post provides an example: a cell whose value includes the name of a function.
The value ultimately displayed in that cell is the output of the specified function — and that function’s argument is…the address of another cell.
Alternatively, a function can accept an array (holding multiple values) as its argument, since Excel began adding the ability to define arrays in September 2018.
A function can also send back an array as its return value, with its values “spilling out” into multiple cells. (In his video presentation at POPL, Williams writes a function that instantly generates a calendar.)
And Excel’s blog post promises more array manipulation functions in the future. Microsoft’s Research blog even promises the company is working on “efficient implementations of array-processing combinators, such as MAP and REDUCE” to be used on the output of named functions.
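For readers unfamiliar with the terms, here is what combinators called map and reduce conventionally do in functional programming, sketched in Python. Whether Excel's eventual MAP and REDUCE behave exactly like this is an assumption; the function `double` and the sample values are made up for illustration.

```python
from functools import reduce


def double(x):
    """A small function to apply to every element of an array."""
    return 2 * x


values = [1, 2, 3, 4]

# "Map": apply a function to each element, producing a new array.
mapped = list(map(double, values))

# "Reduce": fold the array into a single value with an accumulator.
total = reduce(lambda acc, x: acc + x, mapped, 0)

print(mapped, total)
```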
And Excel’s program manager Chris Gross hinted about even higher hopes for the future. “I would love to see us add much-needed tools for debugging and authoring formulas,” he wrote in a comment. “Akin to what you get with great IDEs.”
A Dream From 2004
For Simon Peyton Jones, this represents the fulfillment of a long-standing dream. At POPL 2021, the long-time Microsoft researcher and functional programming advocate told the story of visiting Microsoft’s Excel team in 2004. “I learned that Excel is like a supertanker. It has a very high value. It has a very large mass. And it’s operated by a surprisingly small and heavily-overworked crew. So it’s not easy to change course!”
But 10 years later he discovered that one of the mid-level program managers, “those friendly, receptive folk that I mentioned, had since been promoted to be the great boss of Excel… Suddenly we begin to get senior, executive-level buy-in for some of these ideas.” And, even more importantly, the engineering support to make it happen.
Excel as a programming language
Microsoft’s Research blog calls this new LAMBDA feature “a qualitative shift, not just an incremental change.” Named LAMBDA functions offer programmers the high-quality language-like attributes of “composability” and “re-use,” respecting one of the long-standing principles of good coding, namely not repeating work.
A named function can become part of another named function elsewhere in the spreadsheet. But that’s just the beginning of what could be even more elaborate constructions, according to the Research blog post.
It ultimately envisions “sheet-defined functions” where several different functions, each defined in different cells, are collectively used to define a larger function.
And interestingly, the concept of sheet-defined functions was first described in a 2003 research paper co-authored by Peyton Jones. “Our case study is unusual in highlighting how programming language insights can be applied to a product not normally considered as a programming language,” Peyton Jones had written.
Also speaking at POPL 2021, Advait Sarkar, a senior researcher from Microsoft Research Cambridge, envisioned additional cells used to comment on the code and add formatting flourishes. “We view programming language design as a research discipline whose goal is to create more usable human-computer interfaces,” Sarkar says at one point.
Last year, Sarkar co-authored a paper imagining a new Excel concept called Gridlets — in which a selection of cells could be copy-and-pasted, allowing the reuse of a group of formulas, ultimately offering a kind of object-oriented counterpart to sheet-defined functions.
Gridlets could be instantiated repeatedly, each with its own unique parameters or “properties,” while changes in the original parent gridlet will propagate to its children.
It’s the work of the “Calc Intelligence” team at Microsoft Research Cambridge, which has as its stated goal enhancing Excel as a programming language. Right now, LAMBDA is only available to members of the “Office Insider” program’s beta channel.
Even within that program, eager users complained that Microsoft wasn’t rolling it out fast enough, recalled Excel’s head of product Brian Jones in his POPL 2021 appearance.
Williams said the feature had generated a lot of interest. “Even within the first few weeks of LAMBDAs being released, there were videos with hundreds of thousands of views on YouTube.
There’s really big space to explore how spreadsheets and higher-order functional programming can co-exist and produce a unique experience.”
The post on Microsoft’s Excel blog ends by promising named LAMBDA functions are just the beginning, adding “join us for the journey.”
Microsoft Excel is a spreadsheet developed by Microsoft for Windows, macOS, Android, iOS and iPadOS. It features calculation or computation capabilities, graphing tools, pivot tables, and a macro programming language called Visual Basic for Applications (VBA). Excel forms part of the Microsoft 365 suite of software.
A simple bar graph being created in Excel, running on Windows 11

Developer(s) | Microsoft |
---|---|
Initial release | November 19, 1987; 35 years ago |
Stable release | 2103 (16.0.13901.20400) |
Written in | C++ (back-end)[2] |
Operating system | Microsoft Windows |
Type | Spreadsheet |
License | Trialware[3] |
Website | microsoft.com/en-us/microsoft-365/excel |
Excel for Mac (version 16.67), running on macOS Big Sur 11.5.2

Developer(s) | Microsoft |
---|---|
Initial release | September 30, 1985; 37 years ago |
Stable release | 16.70 (Build 23021201) |
Written in | C++ (back-end), Objective-C (API/UI)[2] |
Operating system | macOS |
Type | Spreadsheet |
License | Proprietary commercial software |
Website | products.office.com/mac |
Excel for Android running on Android 13

Developer(s) | Microsoft Corporation |
---|---|
Stable release | 16.0.14729.20146 |
Operating system | Android Oreo and later |
Type | Spreadsheet |
License | Proprietary commercial software |
Website | products.office.com/en-us/excel |
Developer(s) | Microsoft Corporation |
---|---|
Stable release | 2.70.1 |
Operating system | iOS 15 or later, iPadOS 15 or later |
Type | Spreadsheet |
License | Proprietary commercial software |
Website | products.office.com/en-us/excel |
Features
Basic operation
Microsoft Excel has the basic features of all spreadsheets,[7] using a grid of cells arranged in numbered rows and letter-named columns to organize data manipulations like arithmetic operations. It has a battery of supplied functions to answer statistical, engineering, and financial needs. In addition, it can display data as line graphs, histograms and charts, and with a very limited three-dimensional graphical display. It allows sectioning of data to view its dependencies on various factors for different perspectives (using pivot tables and the scenario manager).[8] A PivotTable is a tool for data analysis. It does this by simplifying large data sets via PivotTable fields. It has a programming aspect, Visual Basic for Applications, allowing the user to employ a wide variety of numerical methods, for example, for solving differential equations of mathematical physics,[9][10] and then reporting the results back to the spreadsheet. It also has a variety of interactive features allowing user interfaces that can completely hide the spreadsheet from the user, so the spreadsheet presents itself as a so-called application, or decision support system (DSS), via a custom-designed user interface, for example, a stock analyzer,[11] or in general, as a design tool that asks the user questions and provides answers and reports.[12][13] In a more elaborate realization, an Excel application can automatically poll external databases and measuring instruments using an update schedule,[14] analyze the results, make a Word report or PowerPoint slide show, and e-mail these presentations on a regular basis to a list of participants. Excel was not designed to be used as a database.[citation needed]
Microsoft allows for a number of optional command-line switches to control the manner in which Excel starts.[15]
Functions
Excel 2016 has 484 functions.[16] Of these, 360 existed prior to Excel 2010. Microsoft classifies these functions in 14 categories. Of the 484 current functions, 386 may be called from VBA as methods of the object «WorksheetFunction»[17] and 44 have the same names as VBA functions.[18]
With the introduction of LAMBDA, Excel will become Turing complete.[19]
Macro programming
VBA programming
Use of a user-defined function sq(x) in Microsoft Excel. The named variables x & y are identified in the Name Manager. The function sq is introduced using the Visual Basic editor supplied with Excel.
Subroutine in Excel calculates the square of named column variable x read from the spreadsheet, and writes it into the named column variable y.
The Windows version of Excel supports programming through Microsoft’s Visual Basic for Applications (VBA), which is a dialect of Visual Basic. Programming with VBA allows spreadsheet manipulation that is awkward or impossible with standard spreadsheet techniques. Programmers may write code directly using the Visual Basic Editor (VBE), which includes a window for writing code, debugging code, and code module organization environment. The user can implement numerical methods as well as automating tasks such as formatting or data organization in VBA[20] and guide the calculation using any desired intermediate results reported back to the spreadsheet.
VBA was removed from Mac Excel 2008, as the developers did not believe that a timely release would allow porting the VBA engine natively to Mac OS X. VBA was restored in the next version, Mac Excel 2011,[21] although the build lacks support for ActiveX objects, impacting some high level developer tools.[22]
A common and easy way to generate VBA code is by using the Macro Recorder.[23] The Macro Recorder records actions of the user and generates VBA code in the form of a macro. These actions can then be repeated automatically by running the macro. The macros can also be linked to different trigger types like keyboard shortcuts, a command button or a graphic. The actions in the macro can be executed from these trigger types or from the generic toolbar options. The VBA code of the macro can also be edited in the VBE. Certain features, such as loops and screen prompts, as well as some graphical display items, cannot be recorded and must be entered into the VBA module directly by the programmer. Advanced users can employ user prompts to create an interactive program, or react to events such as sheets being loaded or changed.
Macro-recorded code may not be compatible across Excel versions. Some code that is used in Excel 2010 cannot be used in Excel 2003. For example, a macro that changes cell colors or other aspects of cells may not be backward compatible.
VBA code interacts with the spreadsheet through the Excel Object Model,[24] a vocabulary identifying spreadsheet objects, and a set of supplied functions or methods that enable reading and writing to the spreadsheet and interaction with its users (for example, through custom toolbars or command bars and message boxes). User-created VBA subroutines execute these actions and operate like macros generated using the macro recorder, but are more flexible and efficient.
History
From its first version, Excel supported end-user programming of macros (automation of repetitive tasks) and user-defined functions (extension of Excel’s built-in function library). In early versions of Excel, these programs were written in a macro language whose statements had formula syntax and resided in the cells of special-purpose macro sheets (stored with file extension .XLM in Windows). XLM was the default macro language for Excel through Excel 4.0.[25] Beginning with version 5.0, Excel recorded macros in VBA by default, but XLM recording was still allowed as an option in that version. After version 5.0 that option was discontinued. All versions of Excel, including Excel 2021, are capable of running an XLM macro, though Microsoft discourages their use.[26]
Charts
Graph made using Microsoft Excel
Excel supports charts, graphs, or histograms generated from specified groups of cells. It also supports Pivot Charts that allow for a chart to be linked directly to a Pivot table. This allows the chart to be refreshed with the Pivot Table. The generated graphic component can either be embedded within the current sheet or added as a separate object.
These displays are dynamically updated if the content of cells changes. For example, suppose that the important design requirements are displayed visually; then, in response to a user’s change in trial values for parameters, the curves describing the design change shape, and their points of intersection shift, assisting the selection of the best design.
Add-ins
Additional features are available using add-ins. Several are provided with Excel, including:
- Analysis ToolPak: Provides data analysis tools for statistical and engineering analysis (includes analysis of variance and regression analysis)
- Analysis ToolPak VBA: VBA functions for Analysis ToolPak
- Euro Currency Tools: Conversion and formatting for euro currency
- Solver Add-In: Tools for optimization and equation solving
Data storage and communication
Number of rows and columns
Versions of Excel up to 7.0 had a limitation in the size of their data sets of 16K (2^14 = 16384) rows. Versions 8.0 through 11.0 could handle 64K (2^16 = 65536) rows and 256 columns (2^8, labeled as column ‘IV’). Version 12.0 onwards, including the current Version 16.x, can handle over 1M (2^20 = 1048576) rows, and 16384 (2^14, labeled as column ‘XFD’) columns.[27]
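The relationship between the column limits and the labels ‘IV’ and ‘XFD’ can be checked with a short Python sketch. It treats a column label as a bijective base-26 numeral (A = 1, ..., Z = 26, AA = 27), which is the usual reading of spreadsheet column letters; the function names are illustrative and not part of any Excel API.

```python
def column_label_to_number(label: str) -> int:
    """Convert a letter column label to its 1-based column number."""
    number = 0
    for ch in label.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column label: {label!r}")
        # Each letter is a digit from 1 to 26 in bijective base 26.
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def column_number_to_label(number: int) -> str:
    """Convert a 1-based column number back to its letter label."""
    if number < 1:
        raise ValueError("column numbers start at 1")
    letters = []
    while number > 0:
        # Subtract 1 first because there is no zero digit in this system.
        number, remainder = divmod(number - 1, 26)
        letters.append(chr(ord("A") + remainder))
    return "".join(reversed(letters))


print(column_label_to_number("IV"), column_label_to_number("XFD"))
```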
File formats
Filename extension | .xls (.xlsx, .xlsm, .xlsb since Excel 2007) |
---|---|
Internet media type | application/vnd.ms-excel |
Uniform Type Identifier (UTI) | com.microsoft.excel.xls |
Developed by | Microsoft |
Type of format | Spreadsheet |
Up until the 2007 version, Microsoft Excel used a proprietary binary file format called Excel Binary File Format (.XLS) as its primary format.[28] Excel 2007 uses Office Open XML as its primary file format, an XML-based format that followed a previous XML-based format called «XML Spreadsheet» («XMLSS»), first introduced in Excel 2002.[29]
Although supporting and encouraging the use of new XML-based formats as replacements, Excel 2007 remained backwards-compatible with the traditional, binary formats. In addition, most versions of Microsoft Excel can read CSV, DBF, SYLK, DIF, and other legacy formats. Support for some older file formats was removed in Excel 2007.[30] The file formats were mainly from DOS-based programs.
Binary
OpenOffice.org has created documentation of the Excel format. Two epochs of the format exist: the 97-2003 OLE format, and the older stream format.[31] Microsoft has made the Excel binary format specification available to freely download.[32]
XML Spreadsheet
The XML Spreadsheet format introduced in Excel 2002[29] is a simple, XML-based format that lacks some more advanced features, such as storage of VBA macros. Though the intended file extension for this format is .xml, the program also correctly handles XML files with the .xls extension. This feature is widely used by third-party applications (e.g. MySQL Query Browser) to offer «export to Excel» capabilities without implementing the binary file format. The following example will be correctly opened by Excel if saved either as Book1.xml or Book1.xls:
    <?xml version="1.0"?>
    <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
     xmlns:o="urn:schemas-microsoft-com:office:office"
     xmlns:x="urn:schemas-microsoft-com:office:excel"
     xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
     xmlns:html="http://www.w3.org/TR/REC-html40">
     <Worksheet ss:Name="Sheet1">
      <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="2" x:FullColumns="1" x:FullRows="1">
       <Row>
        <Cell><Data ss:Type="String">Name</Data></Cell>
        <Cell><Data ss:Type="String">Example</Data></Cell>
       </Row>
       <Row>
        <Cell><Data ss:Type="String">Value</Data></Cell>
        <Cell><Data ss:Type="Number">123</Data></Cell>
       </Row>
      </Table>
     </Worksheet>
    </Workbook>
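Because the format is plain XML, a third-party program can also read it with a general-purpose XML parser. The following Python sketch, using only the standard library, extracts the cell values from a trimmed copy of the example above. The function name `read_rows` and the choice to convert `Number` cells to `float` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by both the elements and the ss:-prefixed attributes.
NS = {"ss": "urn:schemas-microsoft-com:office:spreadsheet"}

# Trimmed copy of the example workbook (optional attributes omitted).
XML_TEXT = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Sheet1">
    <Table>
      <Row>
        <Cell><Data ss:Type="String">Name</Data></Cell>
        <Cell><Data ss:Type="String">Example</Data></Cell>
      </Row>
      <Row>
        <Cell><Data ss:Type="String">Value</Data></Cell>
        <Cell><Data ss:Type="Number">123</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>"""


def read_rows(xml_text):
    """Return the worksheet contents as a list of rows of cell values."""
    root = ET.fromstring(xml_text)
    type_attr = "{%s}Type" % NS["ss"]
    rows = []
    for row in root.iterfind("ss:Worksheet/ss:Table/ss:Row", NS):
        values = []
        for data in row.iterfind("ss:Cell/ss:Data", NS):
            text = data.text or ""
            # Cells typed as Number are converted; everything else stays text.
            values.append(float(text) if data.get(type_attr) == "Number" else text)
        rows.append(values)
    return rows


print(read_rows(XML_TEXT))
```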
Current file extensions
Microsoft Excel 2007, along with the other products in the Microsoft Office 2007 suite, introduced new file formats. The first of these (.xlsx) is defined in the Office Open XML (OOXML) specification.
Format | Extension | Description |
---|---|---|
Excel Workbook | .xlsx | The default Excel 2007 and later workbook format. In reality, a ZIP compressed archive with a directory structure of XML text documents. Functions as the primary replacement for the former binary .xls format, although it does not support Excel macros for security reasons. Saving as .xlsx offers file size reduction over .xls.[33] |
Excel Macro-enabled Workbook | .xlsm | As Excel Workbook, but with macro support. |
Excel Binary Workbook | .xlsb | As Excel Macro-enabled Workbook, but storing information in binary form rather than XML documents for opening and saving documents more quickly and efficiently. Intended especially for very large documents with tens of thousands of rows, and/or several hundreds of columns. This format is very useful for shrinking large Excel files as is often the case when doing data analysis. |
Excel Macro-enabled Template | .xltm | A template document that forms a basis for actual workbooks, with macro support. The replacement for the old .xlt format. |
Excel Add-in | .xlam | Excel add-in to add extra functionality and tools. Inherent macro support because of the file purpose. |
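Since an .xlsx file is a ZIP archive of XML documents, its contents can be listed with any ZIP library. The Python sketch below uses the standard `zipfile` module. The in-memory archive is only a stand-in whose two entries imitate typical part names; it is not a valid workbook, and the helper name `list_parts` is hypothetical.

```python
import io
import zipfile


def list_parts(xlsx_file):
    """Return the names of the files stored inside a ZIP-based workbook."""
    with zipfile.ZipFile(xlsx_file) as archive:
        return archive.namelist()


# Build a small stand-in archive in memory so the sketch runs without a
# real workbook. A genuine .xlsx file contains more parts than these two.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("xl/workbook.xml", "<workbook/>")
buffer.seek(0)

print(zipfile.is_zipfile(buffer))
print(list_parts(buffer))
```

With a real workbook, passing its file path to `list_parts` in place of the buffer would list the XML documents it contains.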
Old file extensions
Format | Extension | Description |
---|---|---|
Spreadsheet | .xls | Main spreadsheet format which holds data in worksheets, charts, and macros. |
Add-in (VBA) | .xla | Adds custom functionality; written in VBA. |
Toolbar | .xlb | The file extension where Microsoft Excel custom toolbar settings are stored. |
Chart | .xlc | A chart created with data from a Microsoft Excel spreadsheet that only saves the chart. To save the chart and spreadsheet, save as .XLS. XLC is not supported in Excel 2007 or in any newer versions of Excel. |
Dialog | .xld | Used in older versions of Excel. |
Archive | .xlk | A backup of an Excel spreadsheet. |
Add-in (DLL) | .xll | Adds custom functionality; written in C++/C, Fortran, etc. and compiled into a special dynamic-link library. |
Macro | .xlm | A macro created by the user or pre-installed with Excel. |
Template | .xlt | A pre-formatted spreadsheet created by the user or by Microsoft Excel. |
Module | .xlv | A module written in VBA (Visual Basic for Applications) for Microsoft Excel. |
Library | .DLL | Code written in VBA may access functions in a DLL; typically this is used to access the Windows API. |
Workspace | .xlw | Arrangement of the windows of multiple workbooks. |
Using other Windows applications
Windows applications such as Microsoft Access and Microsoft Word, as well as Excel, can communicate with each other and use each other’s capabilities. The most common mechanism is Dynamic Data Exchange (DDE): although strongly deprecated by Microsoft, this is a common method to send data between applications running on Windows, with official MS publications referring to it as «the protocol from hell».[34] As the name suggests, it allows applications to supply data to others for calculation and display. It is very common in financial markets, being used to connect to important financial data services such as Bloomberg and Reuters.
Object Linking and Embedding (OLE) allows a Windows application to control another to enable it to format or calculate data. This may take the form of «embedding», where an application uses another to handle a task that it is more suited to; for example, a PowerPoint presentation may be embedded in an Excel spreadsheet or vice versa.[35][36][37][38]
Using external data
Excel users can access external data sources via Microsoft Office features such as .odc connections built with the Office Data Connection file format. Excel files themselves may be updated using a Microsoft-supplied ODBC driver.
Excel can accept data in real-time through several programming interfaces, which allow it to communicate with many data sources such as Bloomberg and Reuters (through add-ins such as Power Plus Pro).
- DDE: «Dynamic Data Exchange» uses the message passing mechanism in Windows to allow data to flow between Excel and other applications. Although it is easy for users to create such links, programming such links reliably is so difficult that Microsoft, the creators of the system, officially refer to it as «the protocol from hell».[34] In spite of its many issues DDE remains the most common way for data to reach traders in financial markets.
- Network DDE: Extended the protocol to allow spreadsheets on different computers to exchange data. Starting with Windows Vista, Microsoft no longer supports the facility.[39]
- Real Time Data (RTD): Although in many ways technically superior to DDE, RTD has been slow to gain acceptance, since it requires non-trivial programming skills, and when first released was neither adequately documented nor supported by the major data vendors.[40][41]
Alternatively, Microsoft Query provides ODBC-based browsing within Microsoft Excel.[42][43][44]
Export and migration of spreadsheets
Programmers have produced APIs to open Excel spreadsheets in a variety of applications and environments other than Microsoft Excel. These include opening Excel documents on the web using either ActiveX controls, or plugins like the Adobe Flash Player. The Apache POI open-source project provides Java libraries for reading and writing Excel spreadsheet files.
Password protection
Microsoft Excel protection offers several types of passwords:
- Password to open a document[45]
- Password to modify a document[46]
- Password to unprotect the worksheet
- Password to protect workbook
- Password to protect the sharing workbook[47]
All passwords except the password to open a document can be removed instantly regardless of the Microsoft Excel version used to create the document. These types of passwords are used primarily for shared work on a document. Such password-protected documents are not encrypted, and data derived from the set password is saved in the document’s header. The password to protect the workbook is an exception – when it is set, a document is encrypted with the standard password «VelvetSweatshop», but since it is known to the public, it does not actually add any extra protection to the document. The only type of password that can prevent a trespasser from gaining access to a document is the password to open a document. The cryptographic strength of this kind of protection depends strongly on the Microsoft Excel version that was used to create the document.
In Microsoft Excel 95 and earlier versions, the password to open is converted to a 16-bit key that can be instantly cracked. In Excel 97/2000 the password is converted to a 40-bit key, which can also be cracked very quickly using modern equipment. As regards services that use rainbow tables (e.g. Password-Find), it takes up to several seconds to remove protection. In addition, password-cracking programs can brute-force attack passwords at a rate of hundreds of thousands of passwords a second, which not only lets them decrypt a document but also find the original password.
In Excel 2003/XP the encryption is slightly better – a user can choose any encryption algorithm that is available in the system (see Cryptographic Service Provider). Due to the CSP, an Excel file cannot be decrypted, and thus the password to open cannot be removed, though the brute-force attack speed remains quite high. Nevertheless, the older Excel 97/2000 algorithm is set by default. Therefore, users who do not change the default settings lack reliable protection of their documents.
The situation changed fundamentally in Excel 2007, where the modern AES algorithm with a 128-bit key started being used for encryption, and a 50,000-fold use of the SHA1 hash function reduced the speed of brute-force attacks to hundreds of passwords per second. In Excel 2010, the default strength of the protection was doubled through the use of a 100,000-fold SHA1 to convert a password to a key.
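The effect of repeated hashing on brute-force speed can be illustrated with a simplified Python sketch. It is not presented as Excel's actual key-derivation routine: the salt, the UTF-16 password encoding and the counter prefixed to each round are assumptions chosen for illustration, and `stretched_key` is a hypothetical name. The point is only that every password guess costs the attacker as many hash computations as there are iterations.

```python
import hashlib


def stretched_key(password: str, salt: bytes, iterations: int) -> bytes:
    """Hash a password repeatedly (illustrative; not Excel's exact scheme)."""
    digest = hashlib.sha1(salt + password.encode("utf-16-le")).digest()
    for i in range(iterations):
        # Each round feeds the previous digest back in with a round counter.
        digest = hashlib.sha1(i.to_bytes(4, "little") + digest).digest()
    return digest


# Key spaces mentioned above: a 16-bit or 40-bit key is small enough to
# enumerate, unlike a 128-bit key.
print(2 ** 16, 2 ** 40, 2 ** 128)

# One guess at 50,000 iterations costs 50,001 SHA-1 computations here.
key = stretched_key("secret", b"\x00" * 16, 50_000)
print(len(key))
```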
Other platforms
Excel for mobile
Excel Mobile is a spreadsheet program that can edit XLSX files. It can edit and format text in cells, calculate formulas, search within the spreadsheet, sort rows and columns, freeze panes, filter the columns, add comments, and create charts. It cannot add columns or rows except at the edge of the document, rearrange columns or rows, delete rows or columns, or add spreadsheet tabs.[48][49][50][51][52][53] The 2007 version has the ability to use a full-screen mode to deal with limited screen resolution, as well as split panes to view different parts of a worksheet at one time.[51] Protection settings, zoom settings, autofilter settings, certain chart formatting, hidden sheets, and other features are not supported on Excel Mobile, and will be modified upon opening and saving a workbook.[52] In 2015, Excel Mobile became available for Windows 10 and Windows 10 Mobile on Windows Store.[54][55]
Excel for the web
Excel for the web is a free lightweight version of Microsoft Excel available as part of Office on the web, which also includes web versions of Microsoft Word and Microsoft PowerPoint.
Excel for the web can display most of the features available in the desktop versions of Excel, although it may not be able to insert or edit them. Certain data connections are not accessible on Excel for the web, including with charts that may use these external connections. Excel for the web also cannot display legacy features, such as Excel 4.0 macros or Excel 5.0 dialog sheets. There are also small differences between how some of the Excel functions work.[56]
Microsoft Excel Viewer
Microsoft Excel Viewer was a freeware program for Microsoft Windows for viewing and printing spreadsheet documents created by Excel.[57] Microsoft retired the viewer in April 2018 with the last security update released in February 2019 for Excel Viewer 2007 (SP3).[58][59]
The first version released by Microsoft was Excel 97 Viewer.[60][61] Excel 97 Viewer was supported in Windows CE for Handheld PCs.[62] In October 2004, Microsoft released Excel Viewer 2003.[63] In September 2007, Microsoft released Excel Viewer 2003 Service Pack 3 (SP3).[64] In January 2008, Microsoft released Excel Viewer 2007 (featuring a non-collapsible Ribbon interface).[65] In April 2009, Microsoft released Excel Viewer 2007 Service Pack 2 (SP2).[66] In October 2011, Microsoft released Excel Viewer 2007 Service Pack 3 (SP3).[67]
To view and print Excel files for free, Microsoft advises using the Excel Mobile application on Windows 10 and, on Windows 7 and Windows 8, uploading the file to OneDrive and opening it in a browser with Excel for the web, which requires a Microsoft account.[58][68]
Quirks
In addition to issues with spreadsheets in general, other problems specific to Excel include numeric precision, misleading statistics functions, mod function errors, date limitations and more.
Numeric precision
Excel maintains 15 figures in its numbers, but they are not always accurate: the bottom line should be the same as the top line.
Despite the use of 15-figure precision, Excel can display many more figures (up to thirty) upon user request. But the displayed figures are not those actually used in its computations, and so, for example, the difference of two numbers may differ from the difference of their displayed values. Although such departures are usually beyond the 15th decimal, exceptions do occur, especially for very large or very small numbers. Serious errors can occur if decisions are made based upon automated comparisons of numbers (for example, using the Excel If function), as equality of two numbers can be unpredictable.[citation needed]
In the figure, the fraction 1/9000 is displayed in Excel. Although this number has a decimal representation that is an infinite string of ones, Excel displays only the leading 15 figures. In the second line, the number one is added to the fraction, and again Excel displays only 15 figures. In the third line, one is subtracted from the sum using Excel. Because the sum in the second line has only eleven 1’s after the decimal, the difference when 1 is subtracted from this displayed value is three 0’s followed by a string of eleven 1’s. However, the difference reported by Excel in the third line is three 0’s followed by a string of thirteen 1’s and two extra erroneous digits. This is because Excel calculates with about half a digit more than it displays.
Excel works with a modified 1985 version of the IEEE 754 specification.[69] Excel’s implementation involves conversions between binary and decimal representations, leading to accuracy that is on average better than one would expect from simple fifteen-digit precision, but that can be worse.
Besides accuracy in user computations, the question of accuracy in Excel-provided functions may be raised. Particularly in the arena of statistical functions, Excel has been criticized for sacrificing accuracy for speed of calculation.[70][71]
As many calculations in Excel are executed using VBA, an additional issue is the accuracy of VBA, which varies with variable type and user-requested precision.[72]
Statistical functions
The accuracy and convenience of statistical tools in Excel has been criticized,[73][74][75][76][77] as mishandling missing data, as returning incorrect values due to inept handling of round-off and large numbers, as only selectively updating calculations on a spreadsheet when some cell values are changed, and as having a limited set of statistical tools. Microsoft has announced some of these issues are addressed in Excel 2010.[78]
Excel MOD function error
Excel has issues with modulo operations. In the case of excessively large results, Excel will return the error warning #NUM! instead of an answer.[79]
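A commonly used workaround is to compute the remainder from its definition, number minus divisor times the integer part of the quotient, instead of calling MOD directly. The sketch below shows that arithmetic in Python; the function name `excel_style_mod` is hypothetical, and the code illustrates the formula only, not Excel's internal implementation or its exact error thresholds.

```python
import math


def excel_style_mod(number: float, divisor: float) -> float:
    """Remainder computed as number - divisor * INT(number / divisor).

    The result takes the sign of the divisor, as a floor-based
    remainder does.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero (Excel reports #DIV/0!)")
    return number - divisor * math.floor(number / divisor)


print(excel_style_mod(10, 3))
print(excel_style_mod(-10, 3))
print(excel_style_mod(3, -2))
```

For very large inputs the floating-point quotient itself loses precision, so this sketch is only reliable while `number / divisor` can be represented exactly.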
Fictional leap day in the year 1900
Excel includes February 29, 1900, incorrectly treating 1900 as a leap year, even though e.g. 2100 is correctly treated as a non-leap year.[80][81] The bug originated from Lotus 1-2-3 (deliberately implemented to save computer memory), and was also purposely implemented in Excel, for the purpose of bug compatibility.[82] This legacy has later been carried over into Office Open XML file format.[83]
Thus a (not necessarily whole) number greater than or equal to 61, interpreted as a date and time, is the (real) number of days after December 30, 1899, 0:00; a non-negative number less than 60 is the number of days after December 31, 1899, 0:00; and numbers with whole part 60 represent the fictional day.
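The rule above can be written down directly. The following is a minimal sketch in Python that converts a serial number in the 1900 date system to a real calendar date; the function name `excel_serial_to_datetime` is an assumption made for illustration, and the sketch rejects the fictional day rather than trying to represent it.

```python
from datetime import datetime, timedelta


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert a 1900-date-system serial number to a datetime.

    Serials below 60 count days after December 31, 1899; serials of 61
    and above count days after December 30, 1899, which skips the
    nonexistent February 29, 1900.
    """
    if serial < 0:
        raise ValueError("negative serial numbers are not valid dates")
    if int(serial) == 60:
        raise ValueError("serial 60 is the fictional February 29, 1900")
    epoch = datetime(1899, 12, 30) if serial >= 61 else datetime(1899, 12, 31)
    return epoch + timedelta(days=serial)


print(excel_serial_to_datetime(59))     # last real day before the gap
print(excel_serial_to_datetime(61))     # first real day after the gap
print(excel_serial_to_datetime(61.5))   # the fractional part is the time of day
```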
Date range
Excel supports dates with years in the range 1900–9999, except that December 31, 1899, can be entered as 0 and is displayed as 0-jan-1900.
Converting a fraction of a day into hours, minutes and days by treating it as a moment on the day January 1, 1900, does not work for a negative fraction.[84]
Conversion problems
When text is entered in a form that Excel interprets as a date, it can be unintentionally converted to a standard date format. A similar problem occurs when text happens to be in the form of floating-point notation for a number. In these cases the original exact text cannot be recovered from the result. Formatting the cell as Text before entering ambiguous text prevents Excel from converting it.
This issue has caused a well-known problem in the analysis of DNA, for example in bioinformatics. As first reported in 2004,[85] genetic scientists found that Excel automatically and incorrectly converts certain gene names into dates. A follow-up study in 2016 found that many peer-reviewed scientific journal papers had been affected and that «Of the selected journals, the proportion of published articles with Excel files containing gene lists that are affected by gene name errors is 19.6 %.»[86] Excel parses copied and pasted data and sometimes changes it depending on what it thinks the data represents. For example, MARCH1 (Membrane Associated Ring-CH-type finger 1) is converted to the date March 1 (1-Mar) and SEPT2 (Septin 2) is converted to September 2 (2-Sep).[87] While some secondary news sources[88] reported this as a fault with Excel, the original authors of the 2016 paper placed the blame on the researchers misusing Excel.[86][89]
In August 2020 the HUGO Gene Nomenclature Committee (HGNC) published new guidelines in the journal Nature regarding gene naming in order to avoid issues with «symbols that affect data handling and retrieval.» So far 27 genes have been renamed, including changing MARCH1 to MARCHF1 and SEPT1 to SEPTIN1 in order to avoid accidental conversion of the gene names into dates.[90]
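One way to catch such symbols before they reach a spreadsheet is to screen a gene list for names that resemble dates. The sketch below is a rough approximation in Python: the pattern and the function name `looks_like_date` are assumptions made for illustration, and Excel's actual parsing rules are broader and locale-dependent.

```python
import re

# Month names and abbreviations that a spreadsheet might read as a month.
_MONTHS = (
    "JAN|FEB|MAR|MARCH|APR|APRIL|MAY|JUN|JUNE|"
    "JUL|JULY|AUG|SEP|SEPT|OCT|NOV|DEC"
)

# A month token followed by a one- or two-digit number, e.g. MARCH1 or SEPT2.
_DATE_LIKE = re.compile(rf"^(?:{_MONTHS})[-/ ]?\d{{1,2}}$", re.IGNORECASE)


def looks_like_date(symbol: str) -> bool:
    """Return True if a gene symbol resembles a month-plus-day date."""
    return bool(_DATE_LIKE.match(symbol.strip()))


for name in ["MARCH1", "SEPT2", "MARCHF1", "SEPTIN1", "TP53"]:
    print(name, looks_like_date(name))
```

Under this approximation the old symbols MARCH1 and SEPT2 are flagged, while the renamed MARCHF1 and SEPTIN1 are not.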
Errors with large strings
The following functions return incorrect results when passed a string longer than 255 characters:[91]
- type() incorrectly returns 16, meaning «Error value»
- IsText(), when called as a method of the VBA object WorksheetFunction (i.e., WorksheetFunction.IsText() in VBA), incorrectly returns «false».
Filenames
Microsoft Excel will not open two documents with the same name and instead will display the following error:
- A document with the name ‘%s’ is already open. You cannot open two documents with the same name, even if the documents are in different folders. To open the second document, either close the document that is currently open, or rename one of the documents.[92]
The reason is calculation ambiguity with linked cells. If a cell contains the reference ='[Book1.xlsx]Sheet1'!$G$33 and two books named «Book1» are open, there is no way to tell which one the user means.[93]
Versions
Early history
Microsoft originally marketed a spreadsheet program called Multiplan in 1982. Multiplan became very popular on CP/M systems, but on MS-DOS systems it lost popularity to Lotus 1-2-3. Microsoft released the first version of Excel for the Macintosh on September 30, 1985, and the first Windows version was 2.05 (to synchronize with the Macintosh version 2.2) on November 19, 1987.[94][95] Lotus was slow to bring 1-2-3 to Windows, and by the early 1990s Excel had started to outsell 1-2-3 and helped Microsoft achieve its position as a leading PC software developer. This accomplishment established Microsoft as a serious competitor and pointed to its future in developing GUI software. Microsoft maintained its advantage with regular new releases, every two years or so.
Microsoft Windows
Excel 2.0 is the first version of Excel for the Intel platform. Versions prior to 2.0 were only available on the Apple Macintosh.
Excel 2.0 (1987)
The first Windows version was labeled «2» to correspond to the Mac version. It was announced on October 6, 1987, and released on November 19.[96] This included a run-time version of Windows.[97]
BYTE in 1989 listed Excel for Windows as among the «Distinction» winners of the BYTE Awards. The magazine stated that the port of the «extraordinary» Macintosh version «shines», with a user interface as good as or better than the original.
Excel 3.0 (1990)
Included toolbars, drawing capabilities, outlining, add-in support, 3D charts, and many more new features.[97]
Excel 4.0 (1992)
Introduced auto-fill.[98]
Also, an easter egg in Excel 4.0 reveals a hidden animation of a dancing set of numbers 1 through 3, representing Lotus 1-2-3, which is then crushed by an Excel logo.[99]
Excel 5.0 (1993)
Version 5.0 introduced Visual Basic for Applications (VBA), a programming language based on Visual Basic that adds the ability to automate tasks in Excel and to provide user-defined functions (UDFs) for use in worksheets. VBA includes a fully featured integrated development environment (IDE). Macro recording can produce VBA code replicating user actions, thus allowing simple automation of regular tasks. VBA allows the creation of forms and in‑worksheet controls to communicate with the user. The language supports use (but not creation) of ActiveX (COM) DLLs; later versions add support for class modules, allowing the use of basic object-oriented programming techniques.
The automation functionality provided by VBA made Excel a target for macro viruses. This caused serious problems until antivirus products began to detect these viruses. Microsoft belatedly took steps to prevent the misuse by adding the ability to disable macros completely, to enable macros when opening a workbook or to trust all macros signed using a trusted certificate.
Versions 5.0 to 9.0 of Excel contain various Easter eggs, including a «Hall of Tortured Souls», a Doom-like minigame, although since version 10 Microsoft has taken measures to eliminate such undocumented features from their products.[100]
Excel 5.0 was released in a 16-bit x86 version for Windows 3.1 and later in a 32-bit version for NT 3.51 (x86/Alpha/PowerPC).
Excel 95 (v7.0)
Released in 1995 with Microsoft Office for Windows 95, this is the first major version after Excel 5.0. There is no Excel 6.0, because all of the Office applications were standardized on the same major version number.
Internal rewrite to 32 bits. Almost no external changes, but faster and more stable.
Excel 95 contained a hidden Doom-like mini-game called «The Hall of Tortured Souls», a series of rooms featuring the names and faces of the developers as an easter egg.[101]
Excel 97 (v8.0)
Included in Office 97 (for x86 and Alpha). This was a major upgrade that introduced the paper clip office assistant and featured standard VBA used instead of internal Excel Basic. It introduced the now-removed Natural Language labels.
This version of Excel includes a flight simulator as an Easter Egg.
Excel 2000 (v9.0)
Included in Office 2000. This was a minor upgrade, but it introduced a clipboard that can hold multiple objects at once. The Office Assistant, whose frequent unsolicited appearance in Excel 97 had annoyed many users, became less intrusive.
A small 3-D game called «Dev Hunter» (inspired by Spy Hunter) was included as an easter egg.[102][103]
Excel 2002 (v10.0)
Included in Office XP. Very minor enhancements.
Excel 2003 (v11.0)
Included in Office 2003. Minor enhancements.
Excel 2007 (v12.0)
Included in Office 2007. This release was a major upgrade from the previous version. Like other updated Office products, Excel 2007 used the new Ribbon menu system. This was different from what users were used to, and was met with mixed reactions. One study reported fairly good acceptance by users, except for highly experienced users and users of word processing applications with a classical WIMP interface, but its findings were less favorable regarding efficiency and organization.[104] However, an online survey reported that a majority of respondents had a negative opinion of the change, with advanced users being «somewhat more negative» than intermediate users, and users reporting a self-estimated reduction in productivity.
Added functionality included Tables,[105] and the SmartArt set of editable business diagrams. Also added were improved management of named variables through the Name Manager and much-improved flexibility in formatting graphs, which allow (x, y) coordinate labeling and lines of arbitrary weight. Several improvements to pivot tables were introduced.
As in other Office products, the Office Open XML file formats were introduced, including .xlsm for a workbook with macros and .xlsx for a workbook without macros.[106]
In addition, many of the size limitations of previous versions were greatly increased. To illustrate, the number of rows was now 1,048,576 (2^20) and the number of columns was 16,384 (2^14; the far-right column is XFD). This changes what is a valid A1 reference versus a named range. This version made more extensive use of multiple cores for the calculation of spreadsheets; however, VBA macros were not handled in parallel, and XLL add‑ins were only executed in parallel if they were thread-safe and this was indicated at registration.
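The relationship between the column count and the label of the far-right column can be checked with a few lines of code. The sketch below, in Python, converts a 1-based column index into its letter label using the usual bijective base-26 scheme; the function and constant names are chosen for illustration only.

```python
MAX_ROWS = 2 ** 20      # 1,048,576 rows
MAX_COLUMNS = 2 ** 14   # 16,384 columns


def column_letters(index: int) -> str:
    """Convert a 1-based column index to its letter label (1 -> A, 27 -> AA)."""
    if not 1 <= index <= MAX_COLUMNS:
        raise ValueError("column index outside the grid")
    letters = ""
    while index > 0:
        # Bijective base 26: there is no zero digit, so shift by one first.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


print(MAX_ROWS, MAX_COLUMNS)
print(column_letters(MAX_COLUMNS))
```

Because three-letter labels such as XFD are now valid cell columns, a short name like ABC1 is read as a cell reference, which is why the larger grid affects what can be used as a named range.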
Excel 2010 (v14.0)
Included in Office 2010, this is the next major version after v12.0, as version number 13 was skipped.
Minor enhancements and 64-bit support,[107] including the following:
- Multi-threading recalculation (MTR) for commonly used functions
- Improved pivot tables
- More conditional formatting options
- Additional image editing capabilities
- In-cell charts called sparklines
- Ability to preview before pasting
- Office 2010 backstage feature for document-related tasks
- Ability to customize the Ribbon
- Many new formulas, most highly specialized to improve accuracy[108]
Excel 2013 (v15.0)
Included in Office 2013. This release added many new tools:
- Improved Multi-threading and Memory Contention
- FlashFill[109]
- Power View[110]
- Power Pivot[111]
- Timeline Slicer
- Windows App
- Inquire[112]
- 50 new functions[113]
Excel 2016 (v16.0)
Included in Office 2016. This release added many new tools:
- Power Query integration
- Read-only mode for Excel
- Keyboard access for Pivot Tables and Slicers in Excel
- New Chart Types
- Quick data linking in Visio
- Excel forecasting functions
- Support for multiselection of Slicer items using touch
- Time grouping and Pivot Chart Drill Down
- Excel data cards[114]
Excel 2019, Excel 2021, Office 365 and subsequent (v16.0)
Microsoft no longer releases Office or Excel in discrete versions. Instead, features are introduced automatically over time using Windows Update. The version number remains 16.0, so only the approximate dates when features appeared can be given.
- Dynamic Arrays. These are essentially array formulas, but they «spill» automatically into neighboring cells and do not need Ctrl+Shift+Enter to create them. Further, dynamic arrays are the default format, with new «@» and «#» operators to provide compatibility with previous versions. This is perhaps the biggest structural change since 2007, and is in response to a similar feature in Google Sheets. Dynamic arrays started appearing in pre-releases around 2018, and as of March 2020 were available in published versions of Office 365, provided the user had selected «Office Insiders».
Apple Macintosh
- 1985 Excel 1.0
- 1988 Excel 1.5
- 1989 Excel 2.2
- 1990 Excel 3.0
- 1992 Excel 4.0
- 1993 Excel 5.0 (part of Office 4.x—Final Motorola 680×0 version[115] and first PowerPC version)
- 1998 Excel 8.0 (part of Office 98)
- 2000 Excel 9.0 (part of Office 2001)
- 2001 Excel 10.0 (part of Office v. X)
- 2004 Excel 11.0 (part of Office 2004)
- 2008 Excel 12.0 (part of Office 2008)
- 2010 Excel 14.0 (part of Office 2011)
- 2015 Excel 15.0 (part of Office 2016—Office 2016 for Mac brings the Mac version much closer to parity with its Windows cousin, harmonizing many of the reporting and high-level developer functions, while bringing the ribbon and styling into line with its PC counterpart.)[116]
OS/2
- 1989 Excel 2.2
- 1990 Excel 2.3
- 1991 Excel 3.0
Summary
Excel for Microsoft Windows:
Year | Name | Version | Comments |
---|---|---|---|
1987 | Excel 2 | 2.0 | Renumbered to 2 to correspond with contemporary Macintosh version. Supported macros (later known as Excel 4 macros). |
1990 | Excel 3 | 3.0 | Added 3D graphing capabilities |
1992 | Excel 4 | 4.0 | Introduced auto-fill feature |
1993 | Excel 5 | 5.0 | Included Visual Basic for Applications (VBA) and various object-oriented options |
1995 | Excel 95 | 7.0 | Renumbered for contemporary Word version. Both programs were packaged in Microsoft Office by this time. |
1997 | Excel 97 | 8.0 | |
2000 | Excel 2000 | 9.0 | Part of Microsoft Office 2000. |
2002 | Excel 2002 | 10.0 | |
2003 | Excel 2003 | 11.0 | Released only 1 year later to correspond better with the rest of Microsoft Office (Word, PowerPoint, etc.). |
2007 | Excel 2007 | 12.0 | |
2010 | Excel 2010 | 14.0 | Due to superstitions surrounding the number 13, Excel 13 was skipped in version counting. |
2013 | Excel 2013 | 15.0 | Introduced 50 more mathematical functions (available as pre-packaged commands, rather than typing the formula manually). |
2016 | Excel 2016 | 16.0 | Part of Microsoft Office 2016 |
Excel for Apple Macintosh:
Year | Name | Version | Comments |
---|---|---|---|
1985 | Excel 1 | 1.0 | Initial version of Excel. Supported macros (later known as Excel 4 macros). |
1988 | Excel 1.5 | 1.5 | |
1989 | Excel 2 | 2.2 | |
1990 | Excel 3 | 3.0 | |
1992 | Excel 4 | 4.0 | |
1993 | Excel 5 | 5.0 | Final Motorola 680x0 version and first PowerPC version. |
1998 | Excel 98 | 8.0 | Excel 6 and Excel 7 were skipped to correspond with the rest of Microsoft Office at the time. |
2000 | Excel 2001 | 9.0 | Part of Office 2001. |
2001 | Excel v. X | 10.0 | Part of Office v. X. |
2004 | Excel 2004 | 11.0 | |
2008 | Excel 2008 | 12.0 | |
2011 | Excel 2011 | 14.0 | As with the Windows version, version 13 was skipped for superstitious reasons. |
2016 | Excel 2016 | 16.0 | As with the rest of Microsoft Office, release dates for the Macintosh version are intended to correspond more closely to those for the Windows version from 2016 onward. |
Excel for OS/2:
Year | Name | Version | Comments |
---|---|---|---|
1989 | Excel 2.2 | 2.2 | Numbered in between Windows versions at the time |
1990 | Excel 2.3 | 2.3 | |
1991 | Excel 3 | 3.0 | Last OS/2 version. Discontinued subseries of Microsoft Excel, which is otherwise still an actively developed program. |
Impact
Excel offers many user interface tweaks over the earliest electronic spreadsheets; however, the essence remains the same as in the original spreadsheet software, VisiCalc: the program displays cells organized in rows and columns, and each cell may contain data or a formula, with relative or absolute references to other cells.
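The difference between relative and absolute references shows up when a formula is copied to another cell. The following is a deliberately naive sketch in Python of filling a formula down by a number of rows: relative row numbers are shifted, while rows anchored with `$` stay fixed. The function name `fill_down` and the regular expression are assumptions for illustration; a real formula parser also has to handle sheet names, ranges across sheets, strings, lowercase references and more.

```python
import re

# Optional "$" plus column letters, optional "$" row anchor, then the row number.
# The lookarounds keep function names such as LOG10( from matching.
_CELL_REF = re.compile(r"(?<![A-Z$])(\$?[A-Z]{1,3})(\$?)(\d+)(?![\d(])")


def fill_down(formula: str, rows: int = 1) -> str:
    """Shift relative row references by `rows`; leave `$`-anchored rows unchanged."""

    def shift(match: re.Match) -> str:
        column, anchor, row = match.groups()
        if anchor:  # absolute row such as B$2 or $B$2
            return match.group(0)
        return f"{column}{int(row) + rows}"

    return _CELL_REF.sub(shift, formula)


print(fill_down("=A2+B$2*$C3"))
print(fill_down("=SUM(A1:A3)", rows=2))
```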
Excel 2.0 for Windows, which was modeled after its Mac GUI-based counterpart, indirectly expanded the installed base of the then-nascent Windows environment. Excel 2.0 was released a month before Windows 2.0, and the installed base of Windows was so low at that point in 1987 that Microsoft had to bundle a runtime version of Windows 1.0 with Excel 2.0.[117] Unlike Microsoft Word, there never was a DOS version of Excel.
Excel became the first spreadsheet to allow the user to define the appearance of spreadsheets (fonts, character attributes, and cell appearance). It also introduced intelligent cell re-computation, where only cells dependent on the cell being modified are updated (previous spreadsheet programs recomputed everything all the time or waited for a specific user command). Excel introduced auto-fill, the ability to drag and expand the selection box to automatically copy a cell or row contents to adjacent cells or rows, adjusting the copies intelligently by automatically incrementing cell references or contents. Excel also introduced extensive graphing capabilities.
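The idea behind dependency-based recomputation can be sketched in a few lines. The Python example below keeps a map from each cell to the formula cells that read it and, when a value changes, re-evaluates only the cells reachable through that map, in dependency order. It is an illustrative toy under stated assumptions: the class `Sheet`, its methods and the `recalc_log` attribute are all hypothetical, and nothing here reflects how Excel's calculation engine is actually implemented.

```python
from collections import defaultdict, deque


class Sheet:
    """Toy spreadsheet that recalculates only the dependents of a changed cell."""

    def __init__(self):
        self.values = {}
        self.formulas = {}                   # cell -> (function, input cells)
        self.dependents = defaultdict(set)   # cell -> formula cells that read it
        self.recalc_log = []                 # order in which formulas were evaluated

    def set_value(self, cell, value):
        self.values[cell] = value
        self._recalc_dependents(cell)

    def set_formula(self, cell, func, inputs):
        self.formulas[cell] = (func, list(inputs))
        for source in inputs:
            self.dependents[source].add(cell)
        self._evaluate(cell)
        self._recalc_dependents(cell)

    def _evaluate(self, cell):
        func, inputs = self.formulas[cell]
        self.values[cell] = func(*(self.values[name] for name in inputs))
        self.recalc_log.append(cell)

    def _recalc_dependents(self, changed):
        # Collect every formula cell that depends on `changed`, directly or not.
        affected, queue = set(), deque([changed])
        while queue:
            for dependent in self.dependents[queue.popleft()]:
                if dependent not in affected:
                    affected.add(dependent)
                    queue.append(dependent)
        # Evaluate in dependency order: a cell is ready once none of its
        # inputs is still waiting to be recalculated.
        pending = set(affected)
        while pending:
            ready = sorted(
                cell for cell in pending
                if not any(name in pending for name in self.formulas[cell][1])
            )
            if not ready:
                raise ValueError("circular reference")
            for cell in ready:
                self._evaluate(cell)
                pending.discard(cell)


sheet = Sheet()
sheet.set_value("A1", 2)
sheet.set_value("B1", 10)
sheet.set_formula("A2", lambda a: a * 2, ["A1"])
sheet.set_formula("A3", lambda a: a + 1, ["A2"])
sheet.set_formula("B2", lambda b: b + 1, ["B1"])

sheet.recalc_log.clear()
sheet.set_value("A1", 5)      # only A2 and A3 depend on A1
print(sheet.recalc_log)
print(sheet.values)
```

Changing A1 re-evaluates A2 and then A3, while B2 is left alone because nothing it reads has changed.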
Security
Because Excel is widely used, it has been attacked by hackers. While Excel is not directly exposed to the Internet, if an attacker can get a victim to open a file in Excel, and there is an appropriate security bug in Excel, then the attacker can gain control of the victim’s computer.[118] UK’s GCHQ has a tool named TORNADO ALLEY with this purpose.[119][120]
Games
Besides the easter eggs, numerous games have been created or recreated in Excel, such as Tetris, 2048, Scrabble, Yahtzee, Angry Birds, Pac-Man, Civilization, Monopoly, Battleship, Blackjack, Space Invaders, and others.[121][122][123][124][125]
In 2020, Excel became an esport with the advent of the Financial Modeling World Cup.[126]
See also
- Comparison of spreadsheet software
- Numbers (spreadsheet)—the iWork equivalent
- Spreadmart
- Financial Modeling World Cup, online esport financial modelling competition using Excel
References
- ^ «Update history for Microsoft Office 2019». Microsoft Docs. Retrieved April 13, 2021.
- ^ a b «C++ in MS Office». cppcon. July 17, 2014. Archived from the original on November 7, 2019. Retrieved June 25, 2019.
- ^ «Microsoft Office Excel 365». Microsoft.com. Retrieved January 25, 2021.
- ^ «Update history for Office for Mac». Microsoft Docs.
- ^ «Microsoft Excel APKs». APKMirror.
- ^ «Microsoft Excel». App Store.
- ^ Harvey, Greg (2006). Excel 2007 For Dummies (1st ed.). Wiley. ISBN 978-0-470-03737-9.
- ^ Harvey, Greg (2007). Excel 2007 Workbook for Dummies (2nd ed.). Wiley. p. 296 ff. ISBN 978-0-470-16937-7.
- ^ de Levie, Robert (2004). Advanced Excel for scientific data analysis. Oxford University Press. ISBN 978-0-19-515275-3.
- ^ Bourg, David M. (2006). Excel scientific and engineering cookbook. O’Reilly. ISBN 978-0-596-00879-6.
- ^ Şeref, Michelle M. H. & Ahuja, Ravindra K. (2008). «§4.2 A portfolio management and optimization spreadsheet DSS». In Burstein, Frad & Holsapple, Clyde W. (eds.). Handbook on Decision Support Systems 1: Basic Themes. Springer. ISBN 978-3-540-48712-8.
- ^ Wells, Eric & Harshbarger, Steve (1997). Microsoft Excel 97 Developer’s Handbook. Microsoft Press. ISBN 978-1-57231-359-0. Excellent examples are developed that show just how applications can be designed.
- ^ Harnett, Donald L. & Horrell, James F. (1998). Data, statistics, and decision models with Excel. Wiley. ISBN 978-0-471-13398-8.
- ^ Some form of data acquisition hardware is required. See, for example, Austerlitz, Howard (2003). Data acquisition techniques using PCs (2nd ed.). Academic Press. p. 281 ff. ISBN 978-0-12-068377-2.
- ^ «Description of the startup switches for Excel». Microsoft Help and Support. Microsoft Support. May 7, 2007. Retrieved December 14, 2010. Microsoft Excel accepts a number of optional switches that you can use to control how the program starts. This article lists the switches and provides a description of each switch.
- ^ «Excel functions (alphabetical)». microsoft.com. Microsoft. Retrieved November 4, 2018.
- ^ «WorksheetFunction Object (Excel)». Office VBA Reference. Microsoft. March 30, 2022. Retrieved November 4, 2018.
- ^ «Functions (Visual Basic for Applications)». Office VBA Reference. Microsoft. September 13, 2021. Retrieved November 4, 2018.
- ^ Gordon, Andy (January 25, 2021). «LAMBDA: The ultimate Excel worksheet function». microsoft.com. Microsoft. Retrieved April 23, 2021.
- ^ For example, by converting to Visual Basic the recipes in Press, William H.; Teukolsky, Saul A.; Vetterling, William T. & Flannery, Brian P. (2007). Numerical recipes: the art of scientific computing (3rd ed.). Cambridge University Press. ISBN 978-0-521-88068-8. Code conversion to Basic from Fortran probably is easier than from C++, so the 2nd edition (ISBN 0521437210) may be easier to use, or the Basic code implementation of the first edition: Sprott, Julien C. (1991). Numerical recipes: routines and examples in BASIC. Cambridge University Press. ISBN 978-0-521-40689-5.
- ^ «Excel». Office for Mac. OfficeforMacHelp.com. Archived from the original on June 19, 2012. Retrieved July 8, 2012.
- ^ «Using Excel — PC or Mac? | Excel Lemon». www.excellemon.com. Archived from the original on September 21, 2016. Retrieved July 29, 2015.
- ^ However, an increasing proportion of Excel functionality is not captured by the Macro Recorder, leading to largely useless macros. Compatibility among multiple versions of Excel is also a downfall of this method. A macro recorded in Excel 2010 may not work in Excel 2003 or older. This is most common when changing colors and formatting of cells. Walkenbach, John (2007). «Chapter 6: Using the Excel macro recorder». Excel 2007 VBA Programming for Dummies (Revised by Jan Karel Pieterse ed.). Wiley. p. 79 ff. ISBN 978-0-470-04674-6.
- ^ Walkenbach, John (February 2, 2007). «Chapter 4: Introducing the Excel object model». cited work. p. 53 ff. ISBN 978-0-470-04674-6.
- ^ «The Spreadsheet Page for Excel Users and Developers». spreadsheetpage.com. J-Walk & Associates, Inc. Retrieved December 19, 2012.
- ^ «Working with Excel 4.0 macros». microsoft.com. Microsoft Office Support. Retrieved December 19, 2012.
- ^ «The "Big Grid" and Increased Limits in Excel 2007». microsoft.com. May 23, 2014. Retrieved April 10, 2008.
- ^ «How to extract information from Office files by using Office file formats and schemas». microsoft.com. Microsoft. February 26, 2008. Retrieved November 10, 2008.
- ^ a b «XML Spreadsheet Reference». Microsoft Excel 2002 Technical Articles. MSDN. August 2001. Retrieved November 10, 2008.
- ^ «Deprecated features for Excel 2007». Microsoft—David Gainer. August 24, 2006. Retrieved January 2, 2009.
- ^ «OpenOffice.org’s documentation of the Microsoft Excel File Format» (PDF). August 2, 2008.
- ^ «Microsoft Office Excel 97 — 2007 Binary File Format Specification (*.xls 97-2007 format)». Microsoft Corporation. 2007.
- ^ Fairhurst, Danielle Stein (March 17, 2015). Using Excel for Business Analysis: A Guide to Financial Modelling Fundamentals. John Wiley & Sons. ISBN 978-1-119-06245-5.
- ^ a b Newcomer, Joseph M. «Faking DDE with Private Servers». Dr. Dobb’s.
- ^ Schmalz, Michael (2006). «Chapter 5: Using Access VBA to automate Excel». Integrating Excel and Access. O’Reilly Media, Inc. ISBN 978-0-596-00973-1.
- ^ Cornell, Paul (2007). «Chapter 5: Connect to other databases». Excel as Your Database. Apress. p. 117 ff. ISBN 978-1-59059-751-4.
- ^ DeMarco, Jim (2008). «Excel’s data import tools». Pro Excel 2007 VBA. Apress. p. 43 ff. ISBN 978-1-59059-957-0.
- ^ Harts, Doug (2007). «Importing Access data into Excel 2007». Microsoft Office 2007 Business Intelligence: Reporting, Analysis, and Measurement from the Desktop. McGraw-Hill Professional. ISBN 978-0-07-149424-3.
- ^ «About Network DDE - Win32 apps». learn.microsoft.com.
- ^ «How to set up and use the RTD function in Excel — Office». learn.microsoft.com.
- ^ DeMarco, Jim (2008). Pro Excel 2007 VBA. Berkeley, CA: Apress. p. 225. ISBN 978-1-59059-957-0. «External data is accessed through a connection file, such as an Office Data Connection (ODC) file (.odc)»
- ^ Bullen, Stephen; Bovey, Rob & Green, John (2009). Professional Excel Development (2nd ed.). Upper Saddle River, NJ: Addison-Wesley. p. 665. ISBN 978-0-321-50879-9. «To create a robust solution, we always have to include some VBA code …»
- ^ William, Wehrs (2000). «An Applied DSS Course Using Excel and VBA: IS and/or MS?» (PDF). The Proceedings of ISECON (Information System Educator Conference). p. 4. Archived from the original (PDF) on August 21, 2010. Retrieved February 5, 2010. «Microsoft Query is a data retrieval tool (i.e. ODBC browser) that can be employed within Excel 97. It allows a user to create and save queries on external relational databases for which an ODBC driver is available.»
- ^ Use Microsoft Query to retrieve external data Archived March 12, 2010, at the Wayback Machine
- ^ «Password protect documents, workbooks, and presentations — Word — Office.com». Office.microsoft.com. Retrieved April 24, 2013.
- ^ «Password protect documents, workbooks, and presentations — Word — Office.com». Office.microsoft.com. Retrieved April 24, 2013.
- ^ «Password protect worksheet or workbook elements — Excel — Office.com». Office.microsoft.com. Archived from the original on March 26, 2013. Retrieved April 24, 2013.
- ^ Ralph, Nate. «Office for Windows Phone 8: Your handy starter guide». TechHive. Archived from the original on October 15, 2014. Retrieved August 30, 2014.
- ^ Wollman, Dana. «Microsoft Office Mobile for iPhone hands-on». Engadget. Retrieved August 30, 2014.
- ^ Pogue, David. «Microsoft Adds Office for iPhone. Yawn». The New York Times. Retrieved August 30, 2014.
- ^ a b Ogasawara, Todd. «What’s New in Excel Mobile». Microsoft. Archived from the original on February 8, 2008. Retrieved September 13, 2007.
- ^ a b «Unsupported features in Excel Mobile». Microsoft. Archived from the original on October 20, 2007. Retrieved September 21, 2007.
- ^ Use Excel Mobile Archived October 20, 2007, at the Wayback Machine. Microsoft. Retrieved September 21, 2007.
- ^ «Excel Mobile». Windows Store. Microsoft. Retrieved June 26, 2016.
- ^ «PowerPoint Mobile». Windows Store. Microsoft. Retrieved June 26, 2016.
- ^ «Differences between using a workbook in the browser and in Excel — Office Support». support.office.com. Archived from the original on 8 February 2017. Retrieved 7 February 2017.
- ^ «Description of the Excel Viewer». Microsoft. February 17, 2012. Archived from the original on April 6, 2013.
- ^ a b «How to obtain the latest Excel Viewer». Microsoft Docs. May 22, 2020. Retrieved January 3, 2021.
- ^ «Description of the security update for Excel Viewer 2007: February 12, 2019». Microsoft. April 16, 2020. Retrieved January 3, 2021.
- ^ «Microsoft Excel Viewer». Microsoft. 1997. Archived from the original on January 20, 1998.
- ^ «Excel 97/2000 Viewer: Spreadsheet Files». Microsoft. Archived from the original on January 13, 2004.
- ^ «New Features in Windows CE .NET 4.1». Microsoft Docs. June 30, 2006. Retrieved January 3, 2021.
- ^ «Excel Viewer 2003». Microsoft. October 12, 2004. Archived from the original on January 15, 2005.
- ^ «Excel Viewer 2003 Service Pack 3 (SP3)». Microsoft. September 17, 2007. Archived from the original on October 11, 2007.
- ^ «Excel Viewer». Microsoft. January 14, 2008. Archived from the original on September 26, 2010.
- ^ «Excel Viewer 2007 Service Pack 2 (SP2)». Microsoft. April 24, 2009. Archived from the original on April 28, 2012.
- ^ «Excel Viewer 2007 Service Pack 3 (SP3)». Microsoft. October 25, 2011. Archived from the original on December 29, 2011.
- ^ «Supported versions of the Office viewers». Microsoft. April 16, 2020. Retrieved January 3, 2021.
- ^ Microsoft’s overview is found at: «Floating-point arithmetic may give inaccurate results in Excel». Revision 8.2; article ID: 78113. Microsoft support. June 30, 2010. Retrieved July 2, 2010.
- ^ Altman, Micah; Gill, Jeff; McDonald, Michael (2004). «§2.1.1 Revealing example: Computing the coefficient standard deviation». Numerical issues in statistical computing for the social scientist. Wiley-IEEE. p. 12. ISBN 978-0-471-23633-7.
- ^ de Levie, Robert (2004). cited work. pp. 45–46. ISBN 978-0-19-515275-3.
- ^ Walkenbach, John (2010). «Defining data types». Excel 2010 Power Programming with VBA. Wiley. pp. 198 ff and Table 8–1. ISBN 978-0-470-47535-5.
- ^ McCullough, Bruce D.; Wilson, Berry (2002). «On the accuracy of statistical procedures in Microsoft Excel 2000 and Excel XP». Computational Statistics & Data Analysis. 40 (4): 713–721. doi:10.1016/S0167-9473(02)00095-6.
- ^ McCullough, Bruce D.; Heiser, David A. (2008). «On the accuracy of statistical procedures in Microsoft Excel 2007». Computational Statistics & Data Analysis. 52 (10): 4570–4578. CiteSeerX 10.1.1.455.5508. doi:10.1016/j.csda.2008.03.004.
- ^ Yalta, A. Talha (2008). «The accuracy of statistical distributions in Microsoft Excel 2007». Computational Statistics & Data Analysis. 52 (10): 4579–4586. doi:10.1016/j.csda.2008.03.005.
- ^ Goldwater, Eva. «Using Excel for Statistical Data Analysis—Caveats». University of Massachusetts School of Public Health. Retrieved November 10, 2008.
- ^ Heiser, David A. (2008). «Microsoft Excel 2000, 2003 and 2007 faults, problems, workarounds and fixes». Archived from the original on April 18, 2010. Retrieved April 8, 2010.
- ^ Function improvements in Excel 2010. Archived April 6, 2010, at the Wayback Machine. Comments are provided from readers that may illuminate some remaining problems.
- ^ «The MOD bug». Byg Software. Archived from the original on January 11, 2016. Retrieved November 10, 2008.
- ^ «Days of the week before March 1, 1900 are incorrect in Excel». Microsoft. Archived from the original on July 14, 2012. Retrieved November 10, 2008.
- ^ «Excel incorrectly assumes that the year 1900 is a leap year». Microsoft. Retrieved May 1, 2019.
- ^ Spolsky, Joel (June 16, 2006). «My First BillG Review». Joel on Software. Retrieved November 10, 2008.
- ^ «The Contradictory Nature of OOXML». ConsortiumInfo.org. January 17, 2007.
- ^ «Negative date and time value are displayed as pound signs (###) in Excel». Microsoft. Retrieved March 26, 2012.
- ^ Zeeberg, Barry R; Riss, Joseph; Kane, David W; Bussey, Kimberly J; Uchio, Edward; Linehan, W Marston; Barrett, J Carl; Weinstein, John N (2004). «Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics». BMC Bioinformatics. 5 (1): 80. doi:10.1186/1471-2105-5-80. PMC 459209. PMID 15214961.
- ^ a b Ziemann, Mark; Eren, Yotam; El-Osta, Assam (2016). «Gene name errors are widespread in the scientific literature». Genome Biology. 17 (1): 177. doi:10.1186/s13059-016-1044-7. PMC 4994289. PMID 27552985.
- ^ Anon (2016). «Microsoft Excel blamed for gene study errors». bbc.co.uk. London: BBC News.
- ^ Cimpanu, Catalin (August 24, 2016). «One in Five Scientific Papers on Genes Contains Errors Because of Excel». Softpedia. SoftNews.
- ^ Ziemann, Mark (2016). «Genome Spot: My personal thoughts on gene name errors». genomespot.blogspot.co.uk. Archived from the original on August 30, 2016.
- ^ Vincent, James (August 6, 2020). «Scientists rename human genes to stop Microsoft Excel from misreading them as dates». The Verge. Retrieved October 9, 2020.
- ^ «Excel: type() and
WorksheetFunction.IsText()
fail for long strings». Stack Overflow. November 3, 2018. - ^ Rajah, Gary (August 2, 2004). «Trouble with macros». The Hindu Business Line. Retrieved March 19, 2019.
- ^ Chirilov, Joseph (January 8, 2009). «Microsoft Excel — Why Can’t I Open Two Files With the Same Name?». MSDN Blogs. Microsoft Corporation. Archived from the original on July 29, 2010. Retrieved March 19, 2019.
- ^ Infoworld Media Group, Inc. (July 7, 1986). InfoWorld First Look: Supercalc 4 challenging 1-2-3 with new tactic.
- ^ «The History of Microsoft — 1987». channel9.msdn.com. Archived from the original on September 27, 2010. Retrieved October 7, 2022.
- ^ «The History of Microsoft — 1987». learn.microsoft.com. Retrieved October 7, 2022.
- ^ a b Walkenbach, John (December 4, 2013). «Excel Version History». The Spreadsheet Page. John Walkenbach. Retrieved July 12, 2020.
- ^ Lewallen, Dale (1992). PC/Computing guide to Excel 4.0 for Windows. Ziff Davis. p. 13. ISBN 9781562760489. Retrieved July 27, 2013.
- ^ Lake, Matt (April 6, 2009). «Easter Eggs we have loved: Excel 4». crashreboot.blogspot.com. Retrieved November 5, 2013.
- ^ Osterman, Larry (October 21, 2005). «Why no Easter Eggs?». Larry Osterman’s WebLog. MSDN Blogs. Retrieved July 29, 2006.
- ^ «Excel 95 Hall of Tortured Souls». Retrieved July 7, 2006.
- ^ «Excel Oddities: Easter Eggs». Archived from the original on August 21, 2006. Retrieved August 10, 2006.
- ^ «Car Game In Ms Excel». Totalchoicehosting.com. September 6, 2005. Retrieved January 28, 2014.
- ^ Dostál, M (December 9, 2010). User Acceptance of the Microsoft Ribbon User Interface (PDF). Palacký University of Olomouc. ISBN 978-960-474-245-5. ISSN 1792-6157. Retrieved May 28, 2013.
- ^ [Using Excel Tables to
Manipulate Billing Data https://mooresolutionsinc.com/downloads/Billing_MJ12.pdf] - ^ Dodge, Mark; Stinson, Craig (2007). «Chapter 1: What’s new in Microsoft Office Excel 2007». Microsoft Office Excel 2007 inside out. Microsoft Press. p. 1 ff. ISBN 978-0-7356-2321-7.
- ^ «What’s New in Excel 2010 — Excel». Archived from the original on December 2, 2013. Retrieved September 23, 2010.
- ^ Walkenbach, John (2010). «Some Essential Background». Excel 2010 Power Programming with VBA. Indianapolis, Indiana: Wiley Publishing, Inc. p. 20. ISBN 9780470475355.
- ^ Harris, Steven (October 1, 2013). «Excel 2013 — Flash Fill». Experts-Exchange.com. Experts Exchange. Retrieved November 23, 2013.
- ^ «What’s new in Excel 2013». Office.com. Microsoft. Retrieved January 25, 2014.
- ^ K., Gasper (October 10, 2013). «Does a PowerPivot Pivot Table beat a regular Pivot Table». Experts-Exchange.com. Experts Exchange. Retrieved November 23, 2013.
- ^ K., Gasper (May 20, 2013). «Inquire Add-In for Excel 2013». Experts-Exchange.com. Experts Exchange. Retrieved November 23, 2013.
- ^ «New functions in Excel 2013». Office.com. Microsoft. Retrieved November 23, 2013.
- ^ «What’s new in Office 2016». Office.com. Microsoft. Retrieved August 16, 2015.
- ^ «Microsoft Announces March Availability of Office 98 Macintosh Edition». Microsoft. January 6, 1998. Retrieved December 29, 2017.
- ^ «Office for Mac Is Finally a ‘First-Class Citizen’«. Re/code. July 16, 2015. Retrieved July 29, 2015.
- ^ Perton, Marc (November 20, 2005). «Windows at 20: 20 things you didn’t know about Windows 1.0». switched.com. Archived from the original on April 11, 2013. Retrieved August 1, 2013.
- ^ Keizer, Gregg (February 24, 2009). «Attackers exploit unpatched Excel vulnerability». Computerworld. IDG Communications, Inc. Retrieved March 19, 2019.
- ^ «JTRIG Tools and Techniques». The Intercept. First Look Productions, Inc. July 14, 2014. Archived from the original on July 14, 2014. Retrieved March 19, 2019.
- ^ Cook, John. «JTRIG Tools and Techniques». The Intercept. p. 4. Retrieved March 19, 2019 – via DocumentCloud.
- ^ Phillips, Gavin (December 11, 2015). «8 Legendary Games Recreated in Microsoft Excel». MUO.
- ^ «Excel Games – Fun Things to Do With Spreadsheets». November 10, 2021.
- ^ «Unusual Uses of Excel». techcommunity.microsoft.com. August 5, 2022.
- ^ «Someone made a version of ‘Civilization’ that runs in Microsoft Excel». Engadget.
- ^ Dalgleish, Debra. «Have Fun Playing Games in Excel». Contextures Excel Tips.
- ^ «Microsoft Excel esports is real and it already has an international tournament». ONE Esports. June 9, 2021.
One of the biggest surprises I’ve had in a while came when I read a statement by Microsoft describing Excel as a programming language. They recently announced that LAMBDA (a concept better known from backend programming languages) will be incorporated into Microsoft Excel, thereby fulfilling all the criteria required for a tool to be considered a programming language.
We will have a lot of Excel programmers among us very soon. I did a deeper study of Excel and realized that most people are barely scratching the surface of what it can do. Microsoft Excel and programming languages share a lot of terminology, which would shock the average programmer.
Data analysis and software engineering have long been perceived as cousins; with this new development, we might as well call them siblings. This might be good for companies, but I fear some may begin to demand that software engineers wear multiple hats. I do believe that the closer data analysis gets to software engineering, the better the synergy between them.
Excel as code
Excel is one of the most widely used software products in the entire
world. Word processors have more users, to be sure, but Excel is
nothing like a word processor. It is in reality a programming
language and database combined.
Not counting Excel users, there are only about 30 million
programmers. Estimates put the number of Excel users between 500m and
over 1 billion!
It is therefore, by far, the most used programming language on the
planet. It is easily 20 times more popular than the next contender.
Excel workbooks run the core of a huge number of business functions,
from budgeting and product management to customer accounts and many
other things besides.
The value of Excel is that it is presenting the data, with a set of
formulae that let you keep derived data up-to-date. This inferred
data provides sums and computations, sometimes simple, but sometimes
exquisitely complex.
And through this whole range of complexity, with half a billion users,
virtually nobody treats Excel seriously like a programming language.
How can this be? We have a programming language which is essentially
acting as a declarative database, and yet we don’t do unit tests, we
don’t keep track of changes, we collaborate on Excel files by sending them
to our colleagues in the mail, and God forbid we should do any
serious linting of what is in the thing.
This is a really crazy situation.
The programmers and database managers will often look at this
situation in terror and tell Excel jockeys they need to get off Excel
ASAP.
The Excel jockeys might look at the database nerds and IT geeks and
think that they must be off their rocker. Or maybe they even feel
ashamed but realise that there is no way they are going to be able to
do their job properly by simply switching to using Oracle &
Python.
Of course anyone who has used Excel in anger realises why it is so
brilliant. Show me another declarative, constraint-based, data-driven
inference language that I can teach to my grandmother and I’ll eat my
hat!
People refuse to stop using Excel because it empowers them and they
simply don’t want to be disempowered.
And right they are. The problem isn’t Excel. The problem is that we
are treating Excel like it’s a word processor, and not what it is: a
programming language.
The Programming Enlightenment
In the dark ages of programming you had a source tree and you edited
files in some terrible text editor and then ran a compiler. Some time
later you’d have a binary that you’d run and see if it crashed. If
everything went well you might share the file on a file server with
your colleagues. They also changed it so you had to figure out how not
to break everything and paste their changes back into your source tree
(or vice versa).
This was clearly a disaster, leading to huge pain in getting the
source code merges to line up without failure.
Enter revision control.
People realised that there needed to be a system of checking files in
and out such that changes could be compared and collisions could be
avoided.
And never did the person have to leave programming in their favourite
editor. Nobody told them to store their code in Oracle. Nobody said
they should share their source code in Google Docs.
This enabled vast improvements in collaboration. Fearless editing of
files created a much more open development environment. You could go
ahead and make that change you knew had to cut across half of the code
because you could figure out how to merge it when the time came. The
number of programmers you could have working on a code base with much
lower communication overhead increased tremendously.
The revision control system enabled a completely new approach to
software development: Continuous Integration / Continuous Deployment
(CI/CD). CI/CD meant that when code was checked in, a series of hooks
that ran unit tests could be run. Linters could be run over the
checked in version. You could even have complex integration tests
running which checked if the software still worked properly with other
processes.
All of these checks meant that the health of the code could be known
up to the minute. It was still possible to introduce breaking changes
by messing something up in a clever way, but a huge class of errors
was removed.
How Excel can join the Renaissance
Unfortunately, none of this applies to Excel because Excel doesn’t
work well with revision control.
Why?
Because Excel is not a source file. It is a database coupled with
code. Git was not built for this: it knows about lines in a file and
that’s it. Good luck trying to use Git to resolve merge conflicts; it
will simply butcher your file.
The path to enlightenment is a more sophisticated revision control
system, one that can understand Excel.
Luckily such a thing does actually exist,
VersionXL.
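To make the contrast with line-based diffing concrete, here is a minimal sketch of what a cell-aware comparison could look like. It assumes a sheet has already been loaded into a plain dictionary mapping cell references to contents; the `diff_cells` name and the sample data are hypothetical, and this is not a description of how VersionXL or any other tool works internally.

```python
def diff_cells(old, new):
    """Compare two {cell_ref: content} snapshots of the same sheet.

    Returns {cell_ref: (before, after)} for every cell that was added,
    removed, or changed. None stands for "cell not present".
    """
    changes = {}
    for ref in sorted(set(old) | set(new)):
        before, after = old.get(ref), new.get(ref)
        if before != after:
            changes[ref] = (before, after)
    return changes


# Hypothetical snapshots: formulas are kept as text, values as numbers.
version_1 = {"A1": 10, "A2": 20, "A3": "=SUM(A1:A2)"}
version_2 = {"A1": 10, "A2": 25, "A3": "=SUM(A1:A2)", "B1": "note"}

for ref, (before, after) in diff_cells(version_1, version_2).items():
    print(f"{ref}: {before!r} -> {after!r}")
```

The unit of change here is the cell rather than the line, which is the property a line-oriented tool cannot offer for a workbook.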
Collaboration
The first benefit to this new approach to putting Excel in version
control will be enabling collaboration. Sure you can send Excel files
to people, but this is the equivalent of me e-mailing my colleague my
source tree every time I want to make a change.
And if I share it with two people at once, I’m sure to end up with two
different changes. And now I must figure out how to incorporate
both. I’ve turned myself into a fault-prone (and probably very
expensive) revision control system. And if I make a mistake I’ll be
digging through my e-mail looking for the one I sent to the first
person in order to merge the correct changes back in again.
With revision control we are winning right out of the traps whenever
there is a collaboration, even between two people. We get to merge
with less hassle, and any mistake is just a rollback.
And at no point did we have to leave Excel.
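The "two people changed my file at once" problem described above is a three-way merge. The sketch below shows the idea at cell granularity, again assuming sheets are plain dictionaries of cell contents. The function name, the conflict policy (keep my value and report the cell), and the sample data are all assumptions made for illustration.

```python
def merge_cells(base, mine, theirs):
    """Three-way merge of {cell_ref: content} snapshots.

    base   - the version both people started from
    mine   - my edited copy
    theirs - my colleague's edited copy
    Returns (merged_sheet, conflicting_cell_refs).
    """
    merged, conflicts = {}, []
    for ref in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(ref), mine.get(ref), theirs.get(ref)
        if m == t:
            value = m            # both sides agree, or nobody touched it
        elif m == b:
            value = t            # only my colleague changed this cell
        elif t == b:
            value = m            # only I changed this cell
        else:
            conflicts.append(ref)
            value = m            # assumed policy: keep mine, flag for review
        if value is not None:    # None means the cell was deleted
            merged[ref] = value
    return merged, conflicts


base = {"A1": 1, "A2": 2, "A3": 3}
mine = {"A1": 10, "A2": 2, "A3": 30}
theirs = {"A1": 1, "A2": 20, "A3": 31}

merged, conflicts = merge_cells(base, mine, theirs)
print(merged)      # {'A1': 10, 'A2': 20, 'A3': 30}
print(conflicts)   # ['A3']
```

Only the cell that both people edited differently needs a human decision; everything else merges without anyone digging through e-mail.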
CI/CD for Excel
Now that we have a revision control system for Excel, we can start to
think seriously about CI/CD and what it would mean to really treat
Excel as code in a modern development environment.
First off is linting. Linting just means writing queries or scripts
which can look for obvious syntactic bugs. The value of this cannot
be overstated. The number of stupid and obvious syntactic bugs (such
as misspellings) that even incredibly intelligent programmers make is
huge. And the value of noticing them is even larger.
What would Excel linting look like? It could be as simple as saying:
All currency values in this file should be in dollars
Or maybe it says:
Cells in column C must be numeric.
But it could be that specific files would require custom and complex
linting. That’s fine, that happens with code too! You should be able
to simply add it as a test hook on commit. Once you get the green
light, you know that it’s safe to merge.
In large corporations or organisations it’s often the case that you’ll
even want aspects of the layout, the number of sheets etc. to remain
uniform even after updates. Linting can enable this to happen.
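As a rough illustration of the two example rules above, here is a small lint sketch. It assumes the sheet has been loaded into a dictionary of cell references, that row 1 is a header, that currencies live in column B as codes such as "USD", and that amounts live in column C. All of those layout choices and function names are hypothetical.

```python
from numbers import Number

# Hypothetical in-memory model of one sheet: {"A1": value, ...}.
sheet = {
    "A1": "Item",   "B1": "Currency", "C1": "Amount",
    "A2": "Rent",   "B2": "USD",      "C2": 1200,
    "A3": "Travel", "B3": "EUR",      "C3": "n/a",
}


def split_ref(ref):
    """Split a cell reference such as 'C12' into ('C', 12)."""
    letters = "".join(ch for ch in ref if ch.isalpha())
    digits = "".join(ch for ch in ref if ch.isdigit())
    return letters, int(digits)


def lint_numeric_column(cells, column, header_rows=1):
    """Rule: cells in `column` below the header must be numeric."""
    problems = []
    for ref, value in sorted(cells.items()):
        col, row = split_ref(ref)
        if col == column and row > header_rows:
            # bool counts as a Number in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, Number):
                problems.append(f"{ref}: expected a number, found {value!r}")
    return problems


def lint_currency_column(cells, column, allowed="USD", header_rows=1):
    """Rule: every currency code in `column` must equal `allowed`."""
    problems = []
    for ref, value in sorted(cells.items()):
        col, row = split_ref(ref)
        if col == column and row > header_rows and value != allowed:
            problems.append(f"{ref}: expected currency {allowed}, found {value!r}")
    return problems


def run_lint(cells):
    """Run every rule; an empty list is the green light."""
    return lint_numeric_column(cells, "C") + lint_currency_column(cells, "B")


for problem in run_lint(sheet):
    print(problem)
```

A commit hook would only need to call `run_lint` and refuse the commit when the returned list is not empty.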
Of course linting doesn’t catch more complex semantic errors. For that
we often want to write down what we expect some formula to do. And
to test that we should have a test case for our formula. This is unit
testing.
Unit testing Excel might mean ensuring certain formulae meet a set of
external assertions that ensure that they still «do the right thing».
The value of having these external verifications might not seem
obvious when you’re calculating a total, but if the calculation is
very complex you probably want to have a few test cases (which might
not necessarily be in your workbook) to sanity test.
And the more important the value of the calculations, the more
sanity should prevail.
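A sketch of what such external assertions could look like follows. Since no spreadsheet engine is available here, a Python function stands in for the workbook formula; the commission formula, its thresholds, and the test table are invented purely for illustration. In a real setup the formula would be evaluated by Excel itself or by a formula engine.

```python
def commission(sales):
    """Stand-in for a hypothetical workbook formula:
    =IF(A2>10000, 500 + (A2-10000)*0.1, A2*0.05)
    """
    if sales > 10000:
        return 500 + (sales - 10000) * 0.1
    return sales * 0.05


# Test cases kept outside the workbook: (input, expected result).
TEST_CASES = [
    (0, 0.0),
    (10000, 500.0),
    (20000, 1500.0),
]


def run_cases(formula, cases, tolerance=1e-9):
    """Return the cases where the formula disagrees with expectations."""
    failures = []
    for given, expected in cases:
        actual = formula(given)
        if abs(actual - expected) > tolerance:
            failures.append((given, expected, actual))
    return failures


failures = run_cases(commission, TEST_CASES)
print("all good" if not failures else failures)
```

If someone later "simplifies" the formula and drops the higher tier, the 20000 case fails and the breakage is caught before the workbook is shared.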
Conclusion
Excel is a programming language. It’s time we start treating it like
one. Excel users want to keep using the power of their favourite
language.
They don’t need to change that.
What needs to change is the idea that they are not programmers, so
they can join us in using modern software practices.
Episode 120 | May 5, 2021
Today, people around the globe—from teachers to small-business owners to finance executives—use Microsoft Excel to make sense of the information that occupies their respective worlds, and whether they realize it or not, in doing so, they’re taking on the role of programmer.
In this episode, Senior Principal Research Manager Andy Gordon, who leads the Calc Intelligence team at Microsoft Research, and Senior Principal Researcher Simon Peyton Jones provide an inside account of the journey Excel has taken as a programming language, including the expansion of data types that has unlocked greater functionality and the release of the LAMBDA function, which makes the Excel formula language Turing-complete. They’ll talk specifically about how research has influenced Excel and vice versa, programming as a human-computer interaction challenge, and a future in which Excel is the first language for budding programmers and a tool for incorporating probabilistic reasoning into our decision-making.
Learn more:
- Excel Blog: “Announcing LAMBDA: Turn Excel formulas into custom functions”
- Microsoft Research Blog: “LAMBDA: The ultimate Excel worksheet function”
- Research Collection: “Innovation by (and beyond) the numbers: A history of research collaborations in Excel”
Transcript
TEASER (SIMON PEYTON JONES): I don’t think I ever dreamt that we could offer something as exotic as LAMBDA. You could really write literally any program in Excel now. Certainly, it becomes computationally much more powerful.
[MUSIC BREAK]
ANDY GORDON: Welcome to the Microsoft Research Podcast.
[MUSIC BREAK]
GORDON: My name is Andy Gordon. The new LAMBDA function has been announced in Excel, and I’m here to tell you about that with my colleague Simon Peyton Jones. Simon, would you like to introduce yourself?
SIMON PEYTON JONES: Yeah, so, I’ve been here at Microsoft Research for 22 years, since 1998. I did my undergraduate degree at Cambridge. I worked for a few years in a small company, and then I worked at University College London and Glasgow University as a professor before moving to Microsoft. And my research interest has always been in functional programming, purely functional programming, as a radical and elegant attack on the entire enterprise of writing programs. But what about you, Andy? How did you get into functional programming and, indeed, Excel?
GORDON: Well, I was doing a PhD on lazy functional programming back in the late ’80s. In fact, when you guys were starting Haskell, my PhD was on input-output for Haskell, and I was actually delighted you personally invited me to sit on the Haskell committee to help standardize input-output using monads. And I joined Microsoft actually in ’97, so I was one of the first employees at Microsoft Research Cambridge.
PEYTON JONES: That’s right. You narrowly predated me. [LAUGHTER]
GORDON: Not just that. I interviewed you. That was a tough decision. [LAUGHTER]
PEYTON JONES: Well, you made a good call. [LAUGHTER]
GORDON: I think mostly. Yeah, so, about my research. Yeah, I started out in functional programming, and I’ve done a bunch of other things in, like, security. And then about 10 years ago, I got into probabilistic programming for machine learning, which, funny enough, led me to spreadsheets because we built a system to support writing probabilistic programs of data within spreadsheets. And at that point, I realized that to take things much further, it would be good to join forces with your work with the Excel team. And in fact, you were involved with the Excel team pretty early on in your time at Microsoft. Do you want to tell us about that?
PEYTON JONES: Yeah, pretty early, because when I first joined Microsoft, I thought to myself, “What can I do that would advance the course of functional programming within Microsoft?” And then I thought, “Well, Excel is the world’s most widely used functional programming language.” It’s not a very powerful one, perhaps, but when you write a formula in a spreadsheet, you are writing in a purely functional language. So, no side effects. You don’t say, you know, “=print3+7”; it wouldn’t make any sense. And moreover, it’s extremely widely used, so there must be a hundred times as many users of formulae in Excel as there are professional programmers in the entire planet. And so, I was thinking maybe we should think about what would it take to start from Excel, but to grow it using the North Star of mainstream functional programming languages and try to see how much more powerful we could make it. Then I started talking to research colleagues, like Alan Blackwell here at Cambridge and Margaret Burnett in Oregon, and we started to come up with some ideas about defining functions of Excel. We had two principal things. The first was, Excel provides 600-odd built-in functions, but it doesn’t provide you with a way to make a new function out of existing ones. So, in every other programming language, you can define a procedure, perhaps, by writing some code, wrapping it up, giving it a name and some parameters, and then you can call it repeatedly. Not so in Excel. If you want to do that, you have to write your procedure in VBA or JavaScript or something, and that’s kind of, like, crazy. We’d like you to be able to define new functions using the existing formula language. Every other programming language lets you do that. Why not Excel? So, we came up with a design for doing that, and we turned it into a research paper. It was published in ICFP in about 2002 or ’03. 
It was called “User-defined functions in Excel.” So then, we started to realize that it wasn’t just enough to have user-defined functions. They needed to be able to take structured data. It wasn’t enough for them to take scalar data. And this is a part which you’ve been much more involved in since the need for rich data in Excel, which sort of complements the need for user-defined functions. So, maybe you could tell us a bit about that.
GORDON: Yeah. I mean, you’re right, Simon. I mean, until pretty recently, the only kind of data that you could have in Excel was text or numbers. That’s all you could have sitting in cells, and so, if you’re trying to turn the formula language into a proper functional programming language, you really need a lot more structures like that. There’s only so many functions you can write that just take two scalars as input and produce another one as an output. There’s not many of those. So, for example, you need arrays.
PEYTON JONES: You mean as arguments and results of functions?
GORDON: Yes. You want to be able to process big pieces of data in one go, and it’s also important to be able to store them in cells. You know, within a sheet-defined function, you might want to have a computation and have it sort of spread out within the grid and then return a whole array, say, as the result of a sheet-defined function. So, generally, we need first-class arrays in the language.
PEYTON JONES: Yeah, we should pause a bit just to explain about what is a sheet-defined function. The idea that we put in this paper for defining new functions in Excel was to say imagine you could take a worksheet and nominate certain cells as the input and one cell as the output and give it a name. And then when you call a function with that name, when you call that function, it is as if you had created a fresh copy of that worksheet, filled in the input values, the arguments of the function in those input cells, calc’d the worksheet, and taken the result out. That was our model for defining a new function. Now, if you want to define a function that works over arrays, for example, “Sort this array” or “Pick the smallest element of it,” then you can’t just take an array of a fixed size, like a three-element array or a 30-element array. We’d like to have an array of arbitrary size, which presumably would then land in a single cell in the worksheet that defines the function.
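The model just described (copy the worksheet, fill in the input cells, calculate, read the output cell) can be sketched in a few lines of Python. This is only an illustration of the idea as stated in the conversation, not Excel's implementation: the worksheet is assumed to be a dictionary whose cells hold either constants or small functions, and the names `make_sheet_function` and `hypotenuse_sheet` are hypothetical.

```python
def make_sheet_function(template, input_cells, output_cell):
    """Turn a worksheet template into an ordinary callable.

    template    - {cell_ref: constant or function(lookup) -> value}
    input_cells - cell refs that receive the arguments, in order
    output_cell - cell ref whose value is the function's result
    """
    def call(*args):
        if len(args) != len(input_cells):
            raise TypeError(f"expected {len(input_cells)} arguments, got {len(args)}")
        sheet = dict(template)                 # a fresh copy for every call
        sheet.update(zip(input_cells, args))   # fill in the input cells
        cache = {}

        def value(ref):
            # Calculate a cell on demand; no cycle detection in this sketch.
            if ref not in cache:
                content = sheet[ref]
                cache[ref] = content(value) if callable(content) else content
            return cache[ref]

        return value(output_cell)              # take the result out

    return call


# Hypothetical worksheet: A1 and A2 are inputs, A4 is the output.
hypotenuse_sheet = {
    "A1": 0,
    "A2": 0,
    "A3": lambda v: v("A1") ** 2 + v("A2") ** 2,
    "A4": lambda v: v("A3") ** 0.5,
}

hypotenuse = make_sheet_function(hypotenuse_sheet, ["A1", "A2"], "A4")
print(hypotenuse(3, 4))  # 5.0
```

Because each call works on its own copy, the template is never modified and calls cannot interfere with one another, which matches the "as if you had created a fresh copy" wording.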
GORDON: This is a great vision, and you wrote a paper about that. So, how do things develop, Simon?
PEYTON JONES: Well, then, so having got the idea, I then would travel to Redmond, the Microsoft mothership, on a fairly regular basis, you know, two or three times a year. So I started to become very familiar with the Excel team, and I discovered that they were very warm towards the idea (we had lots of interesting technical conversations about how Excel might work), but Excel is a very big product, and it doesn’t move quickly. It’s like a sort of supertanker. So, for a long time, I felt a bit like a fly bashing against the bow of the supertanker, and there was always a good reason why now is not a good time to do it, and no criticism for that. There were other priorities, and there still are. But what actually happened in the end was that I got a bit discouraged, and I went away for about 10 years. I forget what provoked me to come back. Maybe it was our lab director, Andrew Blake, who said, “You should really have another go at this.” But it turned out it matched up with their priorities, so we then started a much more active partnership between Microsoft Research Cambridge and Excel on this whole idea of sheet-defined functions and data. But things came out in a very different order than I anticipated. I had started on the sheet-defined function idea, but in fact, the first thing that happened was actually to do with data types and rich structured data involved in functions. In particular, the first thing that came out as a change to the product was called dynamic arrays, and you were quite involved in that, weren’t you? Tell us a bit about it.
GORDON: So, dynamic arrays are really interesting because if you go way back, you discover that the formula language in Excel could in fact construct arrays. You could refer to bits of grid, a range, and then pull that into a formula and compute with it, but what you couldn’t do was return the whole array in one go.
PEYTON JONES: You could do things like add two arrays together or add 1 to every element in an array. You could treat arrays as first-class values within a formula.
GORDON: Right, and so you could lift formulas over whole arrays, so, like, plus 1 and so on, or you could lift over the whole array and you could get the result, but you could only have a scalar in a cell, and so, it was very awkward to return the whole array. There was a notion of a thing called array formulas, and if you were on Windows, the way you entered them was to do control-shift-enter, I think after you’d [LAUGHTER]—yeah, you’d type the formula in, and then you had to remember to go control-shift-enter. It was very complicated. You would select the range, you know, multiple cells that were to hold the array, and you needed to know in advance how big it was, and then you would type in the formula and end it with control-shift-enter, and then that one formula would define the whole array. So, it was really cumbersome.
PEYTON JONES: [LAUGHTER] It’s sort of hilarious, really, when you think about it. Curly braces would appear around the formula.
GORDON: So, it was this advanced feature, and there were a few people who knew how to use it. In fact, there was a person who actually wrote a book called Ctrl+Shift+Enter, which was this sort of guidebook to how you could sort of use this feature, but it was very difficult to use. But now, I think since about 2018, we’ve had dynamic arrays. And this is an amazing feature. You don’t have this separate world of array formulas. Ordinary formulas can return arrays, and you don’t have to decide in advance how big the array is going to be. Instead, you just write the formula, and if it computes an array, then we see it spills out from the cell. Maybe you write a formula that, you know, is a 3 × 3 array, and it will just spill out from—you write the formula in the top left corner, and then the rest of the array will fill out. And then if you want to have another calculation somewhere else—if the original array was in cell A1, then if you say “A1 sharp” then that expression, that formula, A1 sharp, refers to the whole array, and then you can compute to it. You can add that whole array to another one or take the sum of it or take the average of it.
PEYTON JONES: So there you don’t need to know how big the array in A1 sharp is, and indeed, in different calc cycles, it might be different sizes.
GORDON: Right.
PEYTON JONES: That’s really important; the chain of formulae will work regardless of the sizes of the arrays involved, which may vary with the data.
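For readers who want to see the spilling behaviour in miniature, the following Python sketch models it under simplifying assumptions: the grid is a dictionary keyed by (row, column), a formula result is a list of rows, and a separate dictionary remembers how far each anchor cell spilled. The function names are hypothetical, and the real feature has many details this omits.

```python
def spill(grid, spills, anchor, array):
    """Place a 2-D result into the grid, starting at the anchor cell."""
    top, left = anchor
    # Clear whatever this anchor spilled on the previous calc cycle.
    old_rows, old_cols = spills.pop(anchor, (0, 0))
    for r in range(old_rows):
        for c in range(old_cols):
            grid.pop((top + r, left + c), None)

    rows, cols = len(array), len(array[0])
    targets = [(top + r, left + c) for r in range(rows) for c in range(cols)]
    if any(cell in grid for cell in targets):
        # Excel reports a #SPILL! error when the target cells are occupied.
        raise ValueError("spill range is blocked")
    for r, c in targets:
        grid[(r, c)] = array[r - top][c - left]
    spills[anchor] = (rows, cols)


def spilled_range(grid, spills, anchor):
    """Rough analogue of the 'A1 sharp' reference: the whole spilled array."""
    top, left = anchor
    rows, cols = spills[anchor]
    return [[grid[(top + r, left + c)] for c in range(cols)] for r in range(rows)]


grid, spills = {}, {}
spill(grid, spills, (0, 0), [[1, 2], [3, 4]])          # a 2 x 2 result
whole = spilled_range(grid, spills, (0, 0))
print(sum(sum(row) for row in whole))                   # 10

spill(grid, spills, (0, 0), [[5, 6, 7]])                # next cycle: 1 x 3
print(spilled_range(grid, spills, (0, 0)))              # [[5, 6, 7]]
```

The caller of `spilled_range` never states a size, so a downstream calculation keeps working when the array changes shape between calc cycles.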
GORDON: When dynamic arrays came out, we didn’t have LAMBDAs, so I think for everyone in the team, we knew that LAMBDAs were the thing and combinators together with LAMBDAs like map and reduce and fold and scan—these kinds of combinators that are familiar from functional languages like Haskell and others—we knew that those would be really important for programming with arrays, but when they first came out, we didn’t have those. So, we’ve been struggling a little bit to really get the full power of dynamic arrays. There’s various tricks you can use to construct dynamic arrays, but certain kind of common patterns are still quite hard to use, which changes with LAMBDAs coming in.
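The combinators named here have direct counterparts in most functional toolkits. As a small illustration in Python (not in the Excel formula language), `map` applies a function to every element, `functools.reduce` folds an array down to one value, and `itertools.accumulate` plays the role of scan by keeping every intermediate result. The `sales` data is made up for the example.

```python
from functools import reduce
from itertools import accumulate

sales = [120, 80, 200, 50]

# map: apply a lambda to every element of the array.
doubled = list(map(lambda x: x * 2, sales))

# reduce (fold): combine all elements into one value, starting from 0.
total = reduce(lambda acc, x: acc + x, sales, 0)

# scan: like reduce, but keep each intermediate accumulator.
running_total = list(accumulate(sales, lambda acc, x: acc + x))

print(doubled)         # [240, 160, 400, 100]
print(total)           # 450
print(running_total)   # [120, 200, 400, 450]
```

Each combinator takes the lambda as an argument, which is why they only become expressible once functions are first-class values.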
PEYTON JONES: Dynamic arrays still produced a lot of excitement at the time because they really unlocked a huge amount of functionality. It was only previously available through this very arcane control-shift-enter mechanism, and then only if you knew the size of the array. So, they’re dynamic. They really work over a variable size. It wasn’t just arrays, either. So, arrays are a very natural data type for Excel, but we also talked to our colleagues in the product team a lot about records, and that came out at a similar kind of time, as well, didn’t it?
GORDON: Yes. So, these are known as entities in Excel, and you can get them from the data tab in the ribbon, and the funny thing is that they are records, but there isn’t actually a function to create a record from scratch. Instead, the way it’s been introduced is that Excel has got this intelligence feature that can recognize certain names of entities that occur in the workbook. So, if you refer to the name of a company or if you refer to a name—say, Cambridge—Excel can recognize Cambridge as a geographic entity, like the city of Cambridge in the UK, where Simon and I are at the moment, and what Excel will do is go out to Bing’s database, fetch facts about the city of Cambridge, wrap them up in a record, and then drop that into the cell, and if it’s the name of a company, then it would also grab facts like the current stock price or the name of the CEO and drop that into the grid.
PEYTON JONES: You have to do something to make it do that, don’t you? It doesn’t just spot—anytime you say “Cambridge,” it doesn’t just go—yeah, there’s some gesture.
GORDON: You go into the data ribbon, and actually, one of the really cool things about those entities is they’re known as linked entities. It’s not just a record, but they also have this link to where they came from, sort of a pointer to the web, if you like. So, it means that if you open up a stock data type in a workbook, Excel will actually automatically refresh properties like the price, and then it would recalc anything that depends on that. So, these have become really popular as ways of building financial models or getting properties of geographies.
PEYTON JONES: But from a programming point of view, the interesting thing about this was the value in the cell was no longer a scalar. It really is a record, and you could programmatically say “A1.price” to extract the field. That’s the first time that kind of rich, structured data of nested records was available.
GORDON: Yeah, I think that’s the first time Excel had “dot” actually as an operator. So, we now have, like, “dot,” like in many other programming languages, with IntelliSense, so it kind of gives you an option for different fields you might want to pull out of the record.
PEYTON JONES: It was a funny dynamic that, for a programming languages person, you’d say, “Well, we need an introduction form, which lets you create records, and then an elimination form that lets you take them apart”; that’s the dot. But the way it came out in the product is we have the elim form but without the intro form. I think we’ll get an intro form, but it was just an interesting way in which working with a product group makes things come out in a different way, probably a way that’s more useful for users in the first instance than we would have done as academics. I thought that dynamic was interesting.
GORDON: Yeah, actually, I mean, stepping back, Simon, I think this is one of the really big things that I’ve learned from working with the Excel team—I mean, we’ve been working really intensely on this the last five years—which is the kind of people who use Excel. I mean, it’s a really broad population—it’s a super popular application—but the kind of people who use it are generally not programmers, people who really care about writing code for its own sake. Like most people who are sort of functional programmers, say, Haskell programmers, are really passionate about programming, whereas people who are using Excel, not always, but generally, they are really keen to get some other job done, like they’re an accountant, they’re someone working in finance, maybe a teacher tracking statistics about how their students are doing, and they’re what are known as end-user programmers. They are writing code—they’re writing programs—but really for their own purposes. So, when the Excel team thinks about how best to serve end-user programmers, they often do things a little differently than if you were trying to produce a feature for a very sophisticated programmer. Entities are a good example of that, where the Excel team realized that end users who maybe cared about stocks would just want a very convenient way to go from the name of a company to an entity that would automatically update the stock price rather than having to have maybe a function called “record” that they would call and have to put in arguments to determine which entity to fetch, and I think that’s been quite successful, that end users have really been able to use these features, and they don’t really know what a record is, but they can understand that particular entities matter for what they’re trying to do.
PEYTON JONES: We also laid a lot of emphasis on compositionality. I think for the first time, we had not just data that wasn’t a scalar, but data that was arbitrarily nested. So, a record for a country might contain a field that is the states of that country, and that’s an array, an array of states. So, it’s a record that contains an array, and that array contains entities themselves that are descriptors, records about the state. So, you get the sort of compositional, deeply nested form of structures that we’re familiar with in programming languages, but which is really not in Excel at all until this point.
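The nesting described here, a record holding an array of further records, can be sketched in ordinary code. This is only an illustration in Python, not how Excel stores entities; the `Country` and `State` types and the sample values are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class State:
    # A record describing one state (hypothetical fields).
    name: str
    population: int


@dataclass
class Country:
    # A record whose "states" field is an array of State records.
    name: str
    states: List[State] = field(default_factory=list)


# Made-up sample data, purely for illustration.
country = Country("Exampleland", [State("North", 1200), State("South", 800)])

# The dot is the elimination form: it takes the record apart field by field.
total_population = sum(state.population for state in country.states)
state_names = [state.name for state in country.states]
```

The dot notation reaches through each level in turn, which is the compositional property being described.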
GORDON: Yeah, it’s really rich. For example, Microsoft has done this amazing deal with Wolfram, the company that does the WolframAlpha knowledge base, and so Microsoft is giving access to a huge range of data types from Wolfram, like chemical compounds and stars and different food items with their calories and so forth, so there’s a great repertoire of data and a lot of it is nested in the way that you’re talking about.
PEYTON JONES: We started to go further than that, even. After we got arrays and records—those are like two data types—you might think, “Well, what if we want more?” So, we started to think, “What would it mean to make Excel extensible with respect to its data types?” Maybe you could walk up to it as a third-party developer—now not as an end user, but as a real programmer—and plug in something that adds another data type to Excel, let’s say images, right? So, you might have an abstract data type of images with operations you provide that can overlay them or blur them or rotate them, and then you have a little functional programming language whose data types are images, and Excel serves as a sort of orchestration language that lets you write formulas that connect them, but the new data type is provided as a third-party plug-in. That’s not available yet. It’s part of our—you know, your and my vision—and I think the team in Redmond takes it pretty seriously, and we’ve designed the implementation to accommodate that, but we’re starting with some fixed data types. But the idea that ultimately, we could turn Excel into an extensible platform for arbitrary data types that do really quite big, remarkable things in a financial contract, say, is quite an exciting one, I think.
GORDON: Yeah, that’s gonna be really cool because we’ve seen that customers often have used things like previous extensibility features like VBA to in a way sort of hack up some of these features for representing things like financial contracts or financial instruments and sort of simulate things like arrays and records using strings and various other tricks like that. But once this feature comes out, it’ll be possible for third-party developers to sort of really enrich the product with new data types. It’ll be extremely cool.
[MUSIC BREAK]
PEYTON JONES: So, we talked quite a bit about the way in which our partnership has influenced the product, but the reverse has also happened. In fact, we’ve used the credibility with the lab of this very effective product group partnership to justify growing our research program into a group that we call the Calc Intelligence group, which you run now. Maybe you could just tell us a few examples of the broader research-based things we’ve been doing on the back of all this.
GORDON: Yeah, we’ve done a range of different things, and we’ve got papers out on different topics. So, one example is what we call elastic SDFs. So, Simon, you’d explained the idea of sheet-defined functions, that we can describe a calculation by a piece of grid with some of the cells being inputs and some of them being outputs, and I think one thing in your original paper that you didn’t cover was what would happen if one of the inputs was actually a range, was like an array of items. And this is one of the cool things that we’ve worked on together, is like what if the example that you gave was maybe a row of like three items, but you want to generalize to a sheet-defined function that wouldn’t work just on a row of three items but maybe would work on a row of five items or any sort of array input size. And very often, we can do a sort of analysis of the formulas, and if they are processing those three inputs in a uniform way, typically it would be exactly the same formula as applied to each of them, a sort of parallel computation on the three inputs, and maybe with aggregates at the end where maybe you do some calculations and then you sum the results. Then that calculation can be sort of stretched—it can be elasticized, is what we say—to cover maybe an input of size five. So, we wrote this paper that I think is really cool that figured out a theory for when it is possible to generalize an example calculation of the grid that has fixed sizes to arbitrary sizes, and then even better, we worked with HCI experts who are in the Calc Intel team to do a user study to ask, “Well, how do people actually find this in practice?” And we did a study and found that they loved it, that it was a very natural way to program sheet-defined functions. So, that’s a paper I’m really proud of.
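To make the idea of stretching concrete, here is a rough sketch in Python of the before and after. It does not reproduce the analysis from the paper; the per-cell formula (`x * 2 + 1`) and both function names are invented for illustration.

```python
def fixed_sdf(a1, b1, c1):
    # The example as drawn on the grid: the same formula is copied
    # under each of exactly three input cells.
    a2 = a1 * 2 + 1
    b2 = b1 * 2 + 1
    c2 = c1 * 2 + 1
    # The output cell aggregates the row with a sum.
    return a2 + b2 + c2


def elastic_sdf(inputs):
    # The "elasticized" version: because every input was processed by the
    # same formula, the computation can be applied to a row of any length.
    return sum(x * 2 + 1 for x in inputs)
```

On a row of three the two agree, and only the second accepts a row of five.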
PEYTON JONES: It was a funny paper, though, wasn’t it, because it straddles everything from user studies right the way through to a proof of most principal generalization, but we found it quite difficult to publish.
GORDON: We did. We shouldn’t be bitter. Let’s not be bitter, Simon.
[LAUGHTER]
PEYTON JONES: We’re not bitter at all.
GORDON: We’re not bitter.
PEYTON JONES: And in our field, there are kind of three major programming language conferences—ICFP, POPL, and PLDI—and on this one, we got the clean sweep. We were rejected by ICFP, rejected by POPL, and rejected by PLDI.
GORDON: Bravo.
[LAUGHTER]
PEYTON JONES: That’s right. It was a great paper. It is now published in the Journal of Functional Programming.
GORDON: We’re laughing now.
PEYTON JONES: It’s a funny story, but there’s a thing behind it, which is that it is interdisciplinary work, right? It covers the stretch from programming languages theory through to HCI. That means it doesn’t actually fit in any of the existing buckets very neatly. So, that can be a challenge when you’re doing interdisciplinary research.
GORDON: I think it is a shame that these programming language conferences have a hard time, you know, accepting papers that use HCI methods. I’d love it if we could start to have maybe tracks of HCI research at these conferences or co-locate with HCI conferences.
PEYTON JONES: Yeah, but after all, what’s the purpose of a programming language? It is the user interface for the computer, isn’t it? The measure of whether a programming language is any good is whether it helps people get programs to work correctly, more quickly, so it’s an HCI problem, programming.
GORDON: It totally is, and having worked a bit with HCI people, I’m really humbled, actually. I mean, there are some great HCI folks in our team, and I learned such a lot from talking to them, and the thing is they really start from the users, talking to people, and asking them, “What are the sort of problems you’re having that might need some new features?” And then they try and build features and then compare them. In the SDF paper that I’ve just been talking about, we had two different ways of processing arrays, and then the study was actually a comparison of the two. And I’ve really come to respect HCI people for the kind of methods they do to actually get rigorous results when working with people. I think it would be really great if we could have more co-location of HCI research conferences with more kind of technical PL conferences, because I think people would learn from each other in ways that just doesn’t happen at the moment, when they’re separated out at different conferences.
PEYTON JONES: So, Calc View is another example of such a project, isn’t it, about sort of HCI aspects of programming? You want to just tell us a bit about Calc View? What is it?
GORDON: Yeah. We actually published this at an HCI conference called VL/HCC. So, it’s just a radically different view of the grid. We have a spreadsheet grid, but then we have this second representation of it, a bit like a programming language. So, we use a textual notation, and we would literally have the text, like, maybe “A1=2”, then maybe “A2=A1+1”. And you would actually see formulas like that, equations like that, sort of visibly as if it was like a program.
PEYTON JONES: A textual program.
GORDON: Like a textual program.
PEYTON JONES: You can edit with Emacs kind of thing.
GORDON: Exactly. And so, the idea of Calc View is that you have the grid, and then to the right of it, you have this textual view, and certain things are really easy to do in the textual view. Like, for example, we have this nice notation for a formula that has been copied many times. You know, so if the same formula is copied from A1 to A100, you could literally say, “A2:A100=A1+1”. And so, that’s a formula that would basically increment a counter. But textually, it’s a very short program, a bit like a loop, and if you want to sort of change it, you can just change the one copy of that formula, and then every copy in the grid gets updated. And so, we did a user study, and again, we found that people were more effective at certain tasks using Calc View.
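The range notation “A2:A100=A1+1” stands for one formula per cell, with the cell reference shifted down a row each time. A small Python sketch of that expansion follows. It is an assumption about the behavior described here, not the Calc View implementation; `expand_column_formula` is a hypothetical helper that handles only simple references such as `A1`.

```python
import re


def expand_column_formula(column, first_row, last_row, formula):
    """Expand a range assignment such as A2:A100 = A1+1 into one formula per cell.

    Every cell reference in `formula` is shifted down by the distance
    between the target row and the first row of the range.
    """
    cells = {}
    for row in range(first_row, last_row + 1):
        offset = row - first_row
        shifted = re.sub(
            r"([A-Z]+)(\d+)",
            lambda m: f"{m.group(1)}{int(m.group(2)) + offset}",
            formula,
        )
        cells[f"{column}{row}"] = shifted
    return cells


cells = expand_column_formula("A", 2, 100, "A1+1")
```

Editing the single textual formula and re-expanding it updates all 99 copies at once, which is the loop-like convenience being described.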
PEYTON JONES: And what I thought was interesting about it was that in programming, we’re so used to the textual view of a program and we execute it later, and in Excel, we’re so used to just seeing the data and having it execute sort of online all the time, you barely see the formulae, and with Calc View, we’re trying to do it both at the same time and show you the data view continuously calc’d on one side of the screen and the textual view on the other side of the screen, and you can edit in either and it affects the other. I thought that was a really interesting project.
GORDON: So, Simon, when you started working with the Excel team, it wasn’t really clear, was it, what the formula language was? There wasn’t really a sort of programming language style description of the Excel formulas.
PEYTON JONES: Yeah, that’s right. When I first started, I sort of looked at the formula language and thought, “Hey, how hard can this be? You know, it’s just some formula. It’s pretty simple syntax,” and then the more I dug into what Excel actually does, it’s really hard to find good descriptions about it. You’ll find that, for example, references are first-class values. If you say “F(A1:A3)”, you’re not just passing a little 3 vector of the values in A1 to A3; you’re passing a reference to the range A1 to A3 as a first-class value to a function that can do sort of introspective things like, say, what is the row number of the last element. I didn’t really realize that at all. So it turns out there are quite a lot of wrinkles like this. So, I started to write down a paper called “The Semantics of Excel.” It was meant to be a document that says, “Never mind the implementation. This is the semantics of a formula,” and it was split into two in the end, the semantics of a single formula and then the semantics of calc, which is how is a whole spreadsheet full of formulas calc’d and how do we make sure that it’s always kept up to date and calc’d in the right order with the right cells being made dirty? That led to all sorts of interesting conversations, sometimes with the Excel engineers who implemented particular features, because really, only they knew the honest-to-goodness ground truth about this. Again, it was an interesting dialogue between somebody who’s coming at it completely from a semantics and programming languages point of view, and the engineers who’d actually built it. But there was no independent standalone description in that kind of way. And that semantics then led to, “Oh, let’s implement it. Let’s implement a reference semantics for Excel,” I think then as an F# program. 
So, that was meant to be, as it were, an Excel formula evaluator look-alike, a standalone, completely separate from the Excel code base, that implemented the reference model, because after all that, it could be a useful standalone entity. We thought that in an abstract way, and then you had the idea it wasn’t just abstractly useful—we might be able to concretely use it. That turned into Calc.ts, didn’t it?
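The point about references being first-class values can be sketched as follows. This is a simplified Python model, not the F# reference implementation mentioned here; `RangeRef` and its methods are hypothetical names, and the sheet is reduced to a dictionary keyed by (row, column).

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class RangeRef:
    # A reference to a vertical range, kept distinct from the values it holds.
    sheet: Dict[Tuple[int, int], float]
    first_row: int
    last_row: int
    column: int

    def values(self):
        # Dereference: the plain vector of values the range currently contains.
        return [
            self.sheet[(row, self.column)]
            for row in range(self.first_row, self.last_row + 1)
        ]

    def last_row_number(self):
        # Introspection that a bare vector of values could not answer.
        return self.last_row


# A1:A3 holding 10, 20, 30 (column A is column 1 in this sketch).
sheet = {(1, 1): 10.0, (2, 1): 20.0, (3, 1): 30.0}
a1_to_a3 = RangeRef(sheet, first_row=1, last_row=3, column=1)
```

A function that receives `a1_to_a3` can ask both what the values are and where they sit on the grid, which is the wrinkle a semantics has to account for.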
GORDON: Yes. So, we did a hackathon about three years ago, mid-2017. So, we had this F# implementation that was actually getting pretty complete, at least the formula language. We didn’t have all the worksheet functions, but we could evaluate a lot of formulas, and we were starting to wonder, “Maybe we could take Excel formulas into other products.” And so, Excel and Word have got an add-in model based on JavaScript. So, basically, we did this hackathon where we added Excel formulas to Word. It was sort of this grand hack. We had this code in F#, and we wanted to turn it into JavaScript, so we used—oh, what was it called? Is it WebSharper? Yes, WebSharper is this sort of transpiler that compiles F# code to JavaScript. And so, we took this hunk of code and got it running in the hackathon, and then we made this silly video, and you were actually on the judging panel, but we won the MSR Cambridge hackathon that summer. It was great fun. Basically, we made a silly video that made people laugh, so that was great, but I mean, I think people liked the idea that we could take the formulas on a journey outside Excel, and then a really interesting thing happened, that this vice president of engineering, a gentleman called Aleš Holeček, was visiting Cambridge, was looking at our work, and you know, he has a view about what’s happening right across the different parts of Excel, and there is a web version of Excel, in the web browser.
PEYTON JONES: This is Excel Online, right?
GORDON: Excel Online, yeah.
PEYTON JONES: Every time you go to a web browser and wake up Excel, that’s what you get.
GORDON: Yeah, and the way it works—at least the way it worked until then—was that it actually opened the Excel workbook in Azure inside a virtual machine, and then the client, the browser, was sort of like a dumb client, if you like; it didn’t do any calculation there. It did the UI where you could enter formulas and you could enter data, but everything was calculated in Azure.
PEYTON JONES: If you entered a number, it would have to send a message back to a machine halfway across the planet to add 3 and 4 and send the result back to the web browser.
GORDON: Exactly. So, it felt a bit sluggish. And there was a big push to make the product more usable, and so they decided they wanted to evaluate formulas in the browser. But unfortunately, the C++ code base that goes back to, I guess, 1985, when Excel started, it just wasn’t practical to run that code base in the browser, and so Aleš said to us, “You guys can help out. You guys need to go out to Israel because the Web Excel team is based in Herzliya in Israel, and they need a calc engine.” So, we had this amazing work trip out to Israel, met the guys and started collaborating, and we realized pretty early on that we couldn’t ship the F# code, so we did a big rewrite, and we rewrote it in TypeScript and then started a big push to write lots of the worksheet functions in TypeScript. And web Excel, because it’s a website, could move really fast, so the amazing thing was that, like, six months after the hackathon, the initial version of Calc.ts went live, and it was starting to speed up people’s calculations because it was doing the calculations in the browser. We had to do a bit of research to really make this viable, because not all the data in the cloud workbook ends up in the browser, so we had to have a calc engine that worked in the presence of partial data, and then we hit some precision problems because the original C++ code doesn’t use IEEE arithmetic, believe it or not, because it was originally written—
PEYTON JONES: Predated IEEE arithmetic. This is like code that has been around since forever. [LAUGHTER]
GORDON: That’s right. But although it wasn’t possible to do the following for all of the C++ code, we could use WebAssembly to take the core of the mathematics part of the calc engine that did floating-point arithmetic, and it was in C++, and we were able to compile that to JavaScript using WebAssembly or using that tool chain—first to WebAssembly, then back to JavaScript. And so now, we’ve got great performance numbers, so accuracy is, like, 99.95 percent. We can evaluate more than 99 percent of formulas, and a slightly goofy statistic is that there’s so many customers of Excel for the web that Calc.ts saves them a total of seven years every single day because it speeds up the response you get without needing to wait for the server.
PEYTON JONES: Also, it must be helping global warming, right? If you’re gonna add 3 and 4, then that should be one machine instruction. If you’ve gotta send it across the planet to a data center, that must be like bazillions of machine instructions, you know, that are just being wasted.
[MUSIC BREAK]
PEYTON JONES: Going back to Excel as a product, that brings us sort of more or less up to date to the, um, the release of the LAMBDA function. Maybe you could say a little bit about what that is.
GORDON: I think we’re all really super thrilled about having LAMBDAs in Excel. So, LAMBDA, it’s the same LAMBDAs that you have in programming languages. It’s an anonymous function. You can, say, double LAMBDA X—right now, you’d write in Excel as “LAMBDA X, X+X” and then that formula returns a function that if you give it an X, doubles it. Same as in any functional programming language. And this is going to be really great for users of Excel because it’s going to make formulas more readable. Excel formulas are notorious because they tend to be really complicated if you’re trying to do something clever, these notorious megaformulas that are really big things, that are really hard to read but are sort of squished down to a single cell, and LAMBDA will let you give names to parts of those, and we also have LET that lets you give names to subexpressions within a formula. So, overall, it’s gonna make expressions much more readable and also much more reusable. You can define a computation once and then you can call it from many places, and just like in other programming languages, you get the great benefit that if there is something wrong with your initial definition or it was a preliminary definition for testing and you want to improve it and fix a bug, you just need to fix it in the one place and then that will propagate to all the uses.
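For readers who have not met anonymous functions, the doubling example and the naming of subexpressions look roughly like this in Python. It is an analogy only: in a cell the doubling function would be written in Excel’s own syntax as `=LAMBDA(x, x+x)`, and `hypotenuse` is an invented example of LET-style naming.

```python
import math

# Python analogue of LAMBDA X, X+X: an anonymous function that doubles its input.
double = lambda x: x + x


def hypotenuse(a, b):
    # LET-style naming: each subexpression gets a readable name
    # instead of being inlined into one long formula.
    a_squared = a * a
    b_squared = b * b
    return math.sqrt(a_squared + b_squared)
```

Because `double` and `hypotenuse` are each defined once, a fix to either definition reaches every place that calls it.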
PEYTON JONES: So, this is another funny thing, right? So, when Margaret and Alan and I were initially thinking about user-defined functions, we thought, “Well, we have to define it as a worksheet.” I don’t think I ever dreamt that we could offer something as exotic as LAMBDA, which sounded very geeky and programming-language-y, and yet, it serves the same function. You can define named abstractions, but because it doesn’t require much UI—it’s just another form of formula—it’s been much easier to introduce as the first form of the product. I still hope that we’ll get sheet-defined functions, but it turns out that LAMBDA, and this really is the full-on lexically scoped LAMBDA—you can define Church numerals, you can do the whole thing, you can define the Y combinator—this is real LAMBDA—it is now part of Excel, and that’s amazing. That makes the language Turing complete. Because we can write the Y combinator, you could really write literally any program in Excel now. But that’s a qualitative shift, right? So, certainly, it becomes computationally much more powerful, and because we can name these LAMBDAs, you can have functions that call functions that call functions. We can even write recursive functions. These named LAMBDAs can call each other.
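The claim that a lexically scoped LAMBDA is enough for Church numerals and a fixed-point combinator can be checked in any language with closures. The sketch below uses Python rather than Excel formulas, and it uses the Z combinator, the variant of the Y combinator suited to Python’s strict evaluation; all names are illustrative.

```python
# Z combinator: a fixed-point combinator that works under strict evaluation.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Recursion without the function ever referring to its own name.
factorial = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

# Church numerals: the number n is "apply f n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

# Convert a Church numeral back to an ordinary integer for inspection.
to_int = lambda n: n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)
```

Nothing here uses a loop or a named recursive definition, only functions that take and return functions.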
GORDON: Yes. I mean, an example I love—it’s a very simple one, but just reversing a string. Believe it or not, although Excel has had lots of functions for processing strings, you can’t write a function—before LAMBDA, anyway—within the formula language that can reverse a string. But with LAMBDAs, you can just write a nice, little recursive function that does that. There’s real demand for a bit of functionality like that, but it wasn’t possible to do within the formula language.
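The recursive string reversal mentioned here has a very small shape. This Python sketch shows the recursion; the Excel formula in the comment is an untested assumption about how a named LAMBDA along the same lines might be written, not a formula taken from the conversation.

```python
def reverse_text(text):
    # Base case: the empty string is its own reverse.
    if text == "":
        return ""
    # Recursive case: reverse everything after the first character,
    # then put the first character at the end.
    return reverse_text(text[1:]) + text[0]


# A named Excel LAMBDA in the same spirit might look like this
# (an assumption for illustration, where REVERSE is a user-chosen name):
#   REVERSE = LAMBDA(s, IF(s = "", "", REVERSE(RIGHT(s, LEN(s) - 1)) & LEFT(s, 1)))
```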
PEYTON JONES: And I love the way this is sort of empowering users, right? Now, users can write these new functions themselves and, in fact, can write functions specific to their particular domain or area of expertise or work group, and then call those functions just as easily as the built-in ones. I think that’s a huge, huge change.
GORDON: It’s gonna be really interesting to see, you know, the kind of things people do and how we can share them. I’m hoping there’ll be a sort of open-source community build up around LAMBDAs.
PEYTON JONES: Yeah, I really hope that there’ll be a sort of third-party library and people who say, “Here’s my library for doing operations on”—I don’t know—“tensors” or something. And we’re gonna need to provide better support for libraries and richer data types, but I think that’ll come. And then people can produce libraries themselves.
GORDON: Yeah. It’s really great. I mean, ’cause we’ve been using LAMBDAs internally for a few years now, and it’s just a different kind of experience. It’s live programming. You’ve got a grid, you’ve got data just right in front of you. It’s just really highly interpretive. You know, you can change one LAMBDA one place and instantly all the changes percolate.
PEYTON JONES: You’re saying that even as a programming language person, doing functional programming in Excel with LAMBDAs and dynamic arrays feels quite different and more engaging than doing programming in, I don’t know, Haskell or F#. Is that what you mean?
GORDON: Yeah. Exactly.
PEYTON JONES: That actually leads me to just—we’re sort of, I guess, drawing towards the end. I was kind of reflecting on what have we learned from all this. One of the things that for me would be a very exciting outcome, if we can really make Excel into a language that doesn’t have this sort of glass ceiling that prevents you getting beyond a certain point, that lets you write functions that can call functions that can call functions that can have data that has data that has data inside it, then we could maybe imagine introducing programming to children for the first time through the medium of Excel. So, the child’s first programming language, maybe alongside something like Scratch, which is wildly popular, could be a spreadsheet, initially experienced in this very direct visceral way through just entering data and visualizing it with a chart, but then some simple data transformations using functions and then the ability to wrap those functions up into functions of your own. It would bring the pieces of my life together—my functional programming life, my Microsoft life, and my education life. It could bring that all together into, “We teach children programming using Excel for the first time.”
GORDON: I think that’s amazing. I think that’s a fantastic, exciting vision, Simon. Those are things you’ve worked on. I mean, it’s really cool that those things are coming together, but I actually really believe that the spreadsheet environment is a great place to learn because it’s so alive, you know? You make a little change and you can instantly see the results. It’s a really sort of evocative way to program and to learn.
PEYTON JONES: Anything else that you feel you’ve learned from this experience?
GORDON: There’s two things. I mean, I’ve touched on HCI and programming languages, so I won’t say so much about that except that I see a lot of excitement now in that interplay. There’s a few, you know, professors—I’m thinking of people like Elena Glassman or Amy Ko—professors who are on the intersection of programming languages and HCI and are doing really cool stuff, and there’s summer schools on the topic. So, I think that’s really growing as a research area, and I’m really excited to be part of that direction. The other thing is, like I said near the start, that before getting into spreadsheets, I’d spent quite a while looking at probabilistic programming, which is a way of expressing sort of Bayesian decision-making using code, and I sort of put that aside for a bit because I really, you know, I thought the way to achieve that, to sort of empower people to do Bayesian reasoning, to sort of use probabilistic reasoning to make good decisions, I think the way to get there is to get it into spreadsheets. And so, I think the kind of things that we’ve been talking about are a step in that direction. When I was doing probabilistic programming, that was really aimed at quite sophisticated users, people who are like data scientists, but I think we are seeing in society generally a really big need for the general public to understand probabilities and to understand uncertainty. I’m thinking about things like election forecasts. I’m thinking about the various uncertainties around COVID. I think to be sort of numerate about these uncertainties, well, you need to be numerate to deal with them properly, to make good decisions, and to be numerate, you need to take the uncertainty into account, which is usually using probabilities.
So, I think that the kind of features that we’re adding to Excel will make it easier to visualize probabilities, to have charts that take probabilities into account, to sort of package up ways of visualizing probability distributions inside LAMBDAs so we can give textual descriptions of probabilities. People find, you know, a number like, you know, probability of 0.7 a little bit hard to understand; there’s a big idea called natural frequencies, where you basically remove the decimal points and you can say what 0.7 means, and studies have shown that people respond much better to those kinds of natural frequencies—70 out of 100—than probabilities like 0.7. So I think that the programming features we’re adding to Excel will allow people to build libraries that make it easier to construct, to, say, represent distributions using natural frequencies, and eventually to make charts that represent probabilities nicely. And then a bit further out—this is quite a long-term dream—the idea of actually using Excel to build probabilistic models of situations will then be possible given the kind of work we’re doing on adding LAMBDA to Excel. So, that’s my long-term dream of where we’re gonna go with LAMBDAs and these other features in Excel.
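The natural-frequency idea, restating 0.7 as 70 out of 100, is the kind of small helper a library could package up. Here is a minimal Python sketch; `natural_frequency` and its `out_of` parameter are hypothetical, and the rounding rule is an assumption made for the example.

```python
def natural_frequency(probability, out_of=100):
    # Restate a probability as a count out of a round number of cases.
    if not 0 <= probability <= 1:
        raise ValueError("probability must be between 0 and 1")
    count = round(probability * out_of)
    return f"{count} out of {out_of}"
```

For example, `natural_frequency(0.7)` gives the text “70 out of 100”, and `natural_frequency(0.25, out_of=20)` gives “5 out of 20”.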
PEYTON JONES: More generally, I’m quite excited about the possibilities of taking end-user programming seriously, by which I mean that because Excel starts from being an end-user tool—that’s what it has always been; by end-users, people who are trying to get some other job done—that if we can bring that constituency of end users on a journey with us to empower them, to give them the tools to express more and do more, it would be quite exciting to see how far we can go. And I think that functional programming using Excel, there’s a really high ceiling there that can go a long way. But I think just as we were talking about with Calc View and your sense that the programming using this just felt different, I think it may take us on a journey that isn’t altogether where we expect. [MUSIC STARTS PLAYING UNDER DIALOGUE] We’ve already seen that in the dynamics that we’ve discussed between us and the product group, so I think the future’s quite exciting there.
GORDON: Thanks, Simon. It’s been great talking with you this afternoon. We’re so excited about this work, and hope everyone listening is, too. Do check out the Microsoft Research website for more information. Thanks for joining us on the podcast.