Top 7 Pain Points Experienced by Hedge Fund Excel Users & Proposed Solutions
7 Minute Read
Published on April 15, 2019
Hedge Fund - Data Management, Integration & Analytics
Microsoft Excel Is a Great Tool Within Hedge Fund Industry
Most hedge funds extensively use Microsoft Excel as their de facto application because of its ease of use, flexibility, and powerful features, for performing data analysis, calculations, financial modeling, what-if scenarios, among other daily front, middle, and back office tasks.
In addition, hedge funds also use Excel to store huge amounts of data - which they then use for analysis, summarization, as well as reporting purposes. This practice is common, especially in the hedge fund’s middle-back office operations, where many Excel workbooks are used to store various kinds of data including data sets related to portfolio, trades, cash, counterparties, collateral, accounting, and investor performance data.
In this article, I will discuss the top 7 pain points based on my experience working with hedge fund users in the front, middle and back office. I will also discuss in detail how to get past these limitations in order to work smarter and get the most out of your time spent with Excel.
At a typical hedge fund, the Excel workbooks are stored in some form of a folder structure on a network drive that matches a structure similar to this – FIRM / DEPARTMENT / BUSINESS FUNCTION / TIME PERIOD. For example, XYZ Fund / OPERATIONS / RECONCILIATION / 2019-04-06. Although, the folder structure is well thought out, the management of data stored in multiple workbooks can become overwhelming over time. A file folder structure has to be created, and strictly maintained over time to ensure that workbook files for different periods are named and stored correctly in order for this setup to work. It is also paramount that the names for the workbooks follow the specified convention – otherwise it can become difficult to identify and retrieve these workbooks as well as lead to errors when running macros that refer to these workbooks.
Making correct backups on a consistent basis over time of these workbooks, and their data, is also error prone. For example, if your operations department needs to work with 10 linked workbooks in order to perform a certain analysis, each individual workbook must be backed up separately to ensure that the complete set is backed up. In addition, there is no way to confirm if you have omitted any files from your backup or included the wrong workbook.
To a large extent, the success of the whole process depends largely on the domain knowledge and competency of the specific individual(s) working with these workbooks on a daily basis. Every so often, if or when the author of these workbooks, who is also typically in charge of maintaining the naming conventions, is on leave from the company, this knowledge is unavailable. Even worse, if this person leaves the firm, that knowledge may be lost forever. In addition, the person's successor might decide to start their own file naming and storing convention. This results in the whole analysis, summarization, and reporting process to fail. This also means, that future reports based on these linked worksheets can fail because one or two worksheets are now missing from the backup.
Furthermore, using workbooks is difficult because data consistency and validity are hard to enforce when the process is done manually. Also, accidently deleting records (rows, columns) from a workbook will not generate an alert until the summary report is re-run.
A proposed solution to prevent these operational issues from happening would be to transfer all the relevant data into a relational database like Microsoft SQL Server, and then use Microsoft Excel as the front-end to analyze this data.
What are the advantages of integrating a database with Microsoft Excel? Databases prevent data integrity violations through the adoption of database security policies that restrict data modifications to data meant to be used for reporting. Also, database auditing features allow the ability to check and see if any unauthorized modifications have been performed on the data being analyzed.
In summary, using a database will ensure data validity, consistency, and prevent data from accidently being deleted.
Does this scenario sound familiar at your hedge fund? At the end of a time period (day, week, month), multiple Excel workbooks are linked together in order to generate another set of summarized workbooks which are then used to prepare numerous financial reports. This whole process is then repeated at the end of each period, or as and when a request comes in from the investment management team or a current or potential investor. Eventually the hedge fund ends up with tens, if not hundreds, of Microsoft Excel workbooks, storing data in inconsistent formats, authored by numerous employees from the front, middle and back office operations, often on a network accessible drive.
Unfortunately, these workbooks need to be managed very carefully and a change in any workbook can break the entire operations, and reporting setup. Also, the complexity of the workbooks increases over time to a point where it is hard to determine the missing piece, or the broken link, when the summarized financial reports fail or provide wrong analysis.
So, how can a hedge fund move away from creating hundreds of workbooks into a more robust, agile, efficient, and error free operations environment? A practical solution is to integrate a database with the current Microsoft Excel set up. In this scenario, the relevant data is stored in the database, and Excel is only used to perform data analysis, summaries, and reports.
In addition, Microsoft Excel has a maximum record limit – and cannot store more than 1,048,000 rows of data records. This means that if you are storing large data sets over long periods of time – you will eventually exceed this limit and have to use multiple workbooks. Additionally, most hedge funds will separate their workbooks by the smallest time period that makes sense for the hedge fund. In most cases, this will mean that the fund is storing all the workbooks by daily time period, thereby creating a new folder for each trading day.
When compared to the limitations of Excel, there is virtually no limit to the number of rows that can be stored in a database. By storing relevant data inside a database, it becomes much easier for a hedge fund to manipulate, analyze, and summarize data for reporting to the investment management team, compliance reporting, as well as investor reporting. In addition, now you have the ability to perform advanced analysis using powerful BI tools like Microsoft Power BI.
Using a database ensures that the relevant front, middle, and back office data is continuously stored in one central location over long periods (years) of time – which ensures uniformity and consistency. With this arrangement, accessing MTD and YTD data is as easy as accessing data for the previous trade date. The user only has to specify the date ranges for the data needed to retrieve, and the database software will retrieve the matching records as per the specified criteria. Data backup is also simpler because only a single database containing all data records needs to be backed up.
Databases also allow the creation of data records that are summarized and optimized for hedge fund front, middle and back office reporting purposes. This is similar to what is contained in summary workbooks that have aggregated data over specified periods of time. This data can include summarized portfolio data by fund, geography, client, or internal as well as counterparty trade settlement data for reconciliation purposes. With the data being retrieved from the database, the analysis for daily, weekly, monthly, and quarterly time periods will take the same amount of time and effort. By comparison, if you wanted to perform a similar summarization in Excel, the amount of time and effort needed will increase with the time period for which the analysis needs to be done. This means that the chance of errors increases significantly as longer time periods are selected.
Another point to note is that we would expect differences in data formats between the workbooks – which would make it more difficult to combine data.
A typical error that might occur is where one column is defined as a numeric field in one workbook and then defined as a character field in another workbook. Combining the two data types using formulas will fail – and the individual data columns would have to be converted to ensure they match before aggregating them together.
If we were to take an example where a single workbook is used to store portfolio data by fund, geography, investor, etc., for each month – as is typically the case in most hedge fund middle and back offices. After 5 years there would be a total of 60 workbooks to analyze for each type of summarization. Alternatively, using a database to generate a similar summary report would be much more efficient because you would retrieve, segment, and summarize the data using a single database query to get the desired results, regardless of the time period being selected.
Does this scenario sound familiar? After the close of the trading day, your operations team has been working to reconcile cash and positions with the fund administrator and provide the investment team with the most accurate picture of the portfolio. All of a sudden they go to open one of the Excel workbooks, and their computer crashes, resulting in a complete loss of the summary workbook that they have been preparing all day.
When using Excel workbooks, data can be easily lost for any of the following reasons:
When this happens, the only way to recover this data is by restoration of the file from the most recent good backup of the Excel workbook. As most hedge fund operations team members know all too well, this is a tedious and time-consuming process. The automated file recovery features in Excel are not reliable to say the least.
Databases can be automatically backed up on a consistent (daily, weekly, monthly) basis. For example, a data backup can be performed at midnight every day after the end-of-day processing and reconciliation has been done. This means that users would have various backups available depending on how far back they would like them to go. Some databases like MS SQL Server have built-in advanced backup capabilities that allow data to be restored up to the immediate point of failure.
Databases can also be configured to know where to store the data backups and where to look for database restoration data. However, with workbooks, someone has to manually record this information, introducing the risk that backups can't be performed if nobody knows where the backup files were stored. And usually each workbook would require its own backup file. Therefore, after 5 years, 60 backup files would be needed for just one business function within the middle-back office – assuming there was one Excel workbook for each month of related data. Clearly, the number of backup files can quickly get out of hand as time progresses. Sometimes, these network drives, where the Excel workbooks are being backed up, are not accessible from each end user’s computer, so IT needs to get involved in order to access to these backups.
If you have had to work with Excel workbooks where the format of cells was not consistent among workbooks from different time periods, you know how much of an issue data integrity can become. One of the primary features of a database is that it allows us to define a data model which specifies and enforces the format and the validity of data that we will be storing. The data model takes into account the following requirements:
Every time an analyst enters data into the database, the pre-defined data model ensures that only valid data is entered. This means that you would be able to avoid the following errors that are typically associated with workbooks.
Another valuable feature of a database is called Referential Integrity. Referential Integrity allows us to only accept data whose master records already exist. It also ensures that the master record cannot be deleted until all dependent child records have been deleted.
For example, this database feature can ensure that an investor record will not be deleted from the database until all investor related records, like contact information, allocation into a fund, P&L, return, etc. are removed. It can also ensure that investor related details can't be entered, until the main investor record has been created. As a result, data integrity is enhanced by ensuring no phantom records are sitting within the data and ensuring relationship integrity among data elements.
From the hedge fund investment team in the front office, to the operations, finance, and accounting teams in the middle and back office, once the relevant data is being stored in a centralized database, it becomes very simple to manipulate vast amounts of data to drive insights that inform the investment decisions, and help the hedge fund generate more alpha. In the middle office this data can be used for internal, compliance, regulatory and investor reporting.
Some of the more advanced analysis that is possible once the data is in the database include the following:
Hedge funds looking for more advanced analytics can also incorporate Microsoft Power BI or similar online tools to run on top of their data. These reports could be scheduled to be automatically generated, in multiple output formats. They can also be scheduled to be emailed to the relevant users within the hedge fund.
Excel workbooks are designed to be used by a single person at a time. This makes it impossible for multiple users to access and modify different parts or worksheets, while someone else is working on a different worksheet within the same workbook. Databases on the other hand are designed with concurrent usage features that are built in, which makes it easy for multiple people to seamlessly add, delete, insert or update data in the same database simultaneously. In a typical scenario, when multiple team members try to update the same record at the same time – the first user who initiated the update will always have first priority and the second user will be notified that they cannot update the record at that moment – until the first user is done.
Database multi-user capabilities are very useful for data entry projects where different individuals are entering data. This will resonate well with the middle, and back office users, such as the accounting and tax departments. The only way to achieve this in Microsoft Excel is to create multiple workbooks containing each worksheet that can be worked on by separate team members, and then combine the worksheets back into the main workbook at the end of each day to have a continuous, consistent record for the time period’s activities.
This extra, added manual step is not needed when using a database, as all inputs from multiple users would go straight to the same database. Reports on this data can be generated instantly without waiting for any further data consolidation.
With risk mitigation being one of the top concerns for a hedge fund investor, it is important that operational risk be a priority for a successful hedge fund. In most hedge funds, an operations analyst, working with a Microsoft Excel workbook will have access to all the of worksheets within the workbook, despite the fact that they may only need access to one or two worksheets, or in some cases only a small section of a specific worksheet. When using Excel for data storage, there is no way to limit access to any of the data within the workbook(s).
For example, an operations analyst, performing end-of-day P&L analysis for a portfolio, could also have access to the worksheet storing the investor P&L data. This means, that the analyst will know what the P&L is for each investor participating in that fund. In most instances, the hedge fund trusts its employees, and this should not be a problem. However, in order to show a potential investor that you have tight operational risk controls in place, you could advise them that your fund has the ability to limit access to their personal, and financial data.
Another reason to limit full read and write access to the data is that an operations analyst with unlimited access might accidently change or delete the contents of a cell or row in a worksheet without even knowing it. These errors are very difficult and time consuming to detect, and correct.
On the other hand, if the hedge fund is storing the data in a database, data security can be easily put in place to ensure that the access to data can be limited in both scope, and range based on each individual operations analyst's exact role and project requirements. In addition, as is typically the case for most hedge funds, the default access could be a read-only access for most operations analysts, thereby preventing the data from accidentally being deleted. This kind of permission control is very difficult to implement with Microsoft Excel workbooks, since you are restricted to either giving full read, write access to everyone or denying the same permissions to everyone. There is only one level for security available which is driven by the Microsoft Excel workbook password.
Integrating a database as your back-end within your hedge fund operations, allows a fund to collect relevant operational data during the trading day, over long periods of time. This data can then be used for various front, middle, and back office functions. Using databases, along with Microsoft Power BI, also allows a hedge fund to perform more advanced analytics like real-time dashboards, drill-down analysis, segmentation and cohort analysis, data grouping and Binning, predictive analytics, etc. In addition, it is relatively easy to email this analysis to the relevant parties, as well as make it web and mobile accessible.
Excel, when combined with a database as the back-end to its front-end, will provide a long term, reliable approach for most hedge funds looking to grow AUM. The mindset shift is that Microsoft Excel should primarily be used for data analysis, testing financial models and what-if scenarios. It should not be used for data storage purposes. The data used by Excel workbooks should be extracted from a database which should be used to store, manage and accumulate data over long periods of time.
This approach allows a hedge fund to efficiently and successfully continue using Excel for data analysis, but handle data storage, integrity and management with a database.
Signup for our monthly newsletter, that contains our exclusive research, whitepapers and best practices on hedge fund technology.