This part is the continuation of part I.
Machine learning should classify the data it receives as efficiently as possible and without serious deviations. Consequently, all relevant characteristics must be selected and taken into account so that the data can be classified correctly. There are three different procedures for learning from the existing data.
In supervised learning, in contrast to unsupervised learning, the machine draws no conclusions of its own. The cognitive effort of selection lies with the teacher (e.g. the programmer). In unsupervised learning, by contrast, the machine itself forms new theories. For such a system to obtain structured and usable knowledge through experiments and observations, it has to draw a considerable number of conclusions. Besides these two elementary learning methods, supervised and unsupervised learning, there is also so-called reinforcement learning. With this method, the learning system receives specific feedback in the form of reinforcements, that is, rewards or punishments.
Learning approaches & algorithms
To show how the data to be learned can be classified using a wide variety of algorithms, this section describes three exemplary approaches. The goal of artificial neural networks is to imitate the way the human brain works and to represent it as a mathematical model. This form of classification is then used to identify complex relationships between input and output values. Artificial neurons simulate the functioning of a biological nerve cell, which consists of four components: soma, dendrites, axon and synapses. In the mathematical equivalent, the soma is divided into three basic functions: the input function, the activation function and the actual output, which is forwarded to the output links (axon). A weight is assigned to each individual input link. This weighting or amplification of certain input signals (dendrites) is carried out by the so-called synapses (weights, including the bias weight).
In contrast to neural networks, the Naive Bayes classifier, which belongs to the family of supervised learning methods, assumes that every single feature is independent of the remaining features. This means that each individual characteristic makes an independent probability contribution to whether an object belongs to a certain class. A Naive Bayes filter is used, for example, to assign emails to two categories: spam or non-spam. A modification of Bayes' theorem results in a construct that is used to classify emails into these two categories.
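The spam filter described above can be sketched in a few lines of Python. This is a minimal illustration, not a production filter: the four training mails, the function names train_naive_bayes and classify, and the use of Laplace smoothing are assumptions made for this example. Each class receives a score made up of its prior probability multiplied by one independent probability contribution per word (computed as a sum of logarithms), and the class with the higher score wins.

```python
import math
from collections import Counter

def train_naive_bayes(examples):
    """Count mails per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = {}
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def classify(model, text):
    """Return the class with the highest log-probability score for `text`."""
    class_counts, word_counts, vocabulary = model
    total_mails = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_mails in class_counts.items():
        # Log prior: how frequent is this class in the training data?
        score = math.log(n_mails / total_mails)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training are ignored
            # Independent contribution of each word, with Laplace smoothing
            # (an assumption of this sketch) to avoid zero probabilities.
            smoothed = word_counts[label][word] + 1
            score += math.log(smoothed / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training data, invented for this illustration.
TRAINING_MAILS = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda for monday", "non-spam"),
    ("project report attached", "non-spam"),
]

model = train_naive_bayes(TRAINING_MAILS)
print(classify(model, "cheap money"))             # spam
print(classify(model, "monday project meeting"))  # non-spam
```

Training only counts words, which is why the training phase is fast and needs little memory.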
This algorithm significantly simplifies the problem of classifying complex facts. Another major advantage of the Naive Bayes classifier is its fast convergence and low memory requirement during the training phase.
Another approach to differentiating the data sets to be learned is the k-nearest neighbor principle. Like the Naive Bayes classifier, the k-nearest neighbor method is a supervised learning method. It makes its decisions based on the closest training objects in the feature space: the k closest neighbors take part in a majority vote.
---
The k-nearest neighbor principle is used, for example, in credit checks. To determine which people will or will not pay their bills, they are divided into two classes. In this example, the classification uses only the characteristics age and income; other influences are not considered. To determine whether a person A would pay an invoice, the class membership of, for example, the five nearest neighbors of person A is evaluated. With regard to age and income, three of the five people closest to person A in the feature space paid their bills in the past, while two did not. Accordingly, person A is assigned to class 1 (the payers). A major advantage of the k-nearest neighbor classifier is that, in principle, no training is required; only the feature vectors of the individual objects and their class labels are stored.
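The credit check example can be reproduced with a short sketch. The customer records, the name knn_classify and the label coding (1 = paid, 0 = did not pay) are assumptions invented for this illustration. Income is given in thousands of euros so that both features are on a roughly comparable scale; with raw income values the Euclidean distance would be dominated by income.

```python
import math
from collections import Counter

def knn_classify(training_data, query, k=5):
    """Majority vote among the k training points closest to `query`."""
    by_distance = sorted(training_data, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0], votes

# Hypothetical customers: ((age, income in thousand euros), label)
# Label 1 = paid the bill, label 0 = did not pay.
CUSTOMERS = [
    ((25, 30), 1),
    ((30, 35), 1),
    ((35, 40), 0),
    ((28, 32), 0),
    ((32, 38), 1),
    ((55, 90), 1),
    ((60, 20), 0),
    ((45, 70), 1),
]

PERSON_A = (30, 36)

label, votes = knn_classify(CUSTOMERS, PERSON_A, k=5)
print("class:", label)                                 # class: 1
print("paid:", votes[1], "did not pay:", votes[0])     # paid: 3 did not pay: 2
```

No model is fitted beforehand: the function only stores and sorts the feature vectors, which matches the point that k-nearest neighbor needs no training phase.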
Basics of the financial market
In all financial markets worldwide, funds are transferred from capital providers (investors) to capital seekers (borrowers). The capital providers come from the private sector. In return for providing capital, they receive a countervalue, which is often issued in the form of securities. The capital seekers fall into two groups: companies and the state. The intermediation of the capital, be it equity or debt, is carried out by so-called financial intermediaries. These include, for example, banks, insurance companies, building societies and fund companies. These institutions thus act as conduits for financial assets.
The part of the capital that is not used to finance domestic investments in tangible assets flows abroad as a net capital export. The net capital export is calculated as capital exports minus capital imports. It follows that, from the domestic point of view, financial assets in the form of net foreign claims increase, while the rest of the world incurs net liabilities to the domestic economy, so that its financial assets decrease.
The interaction of various actors is necessary to ensure the daily operation of the financial market. The main task of a country's central bank is to guarantee monetary stability. This includes stimulating the business cycle, but also recognizing recessions at an early stage in order to counteract them. Above all, however, it should ensure a certain level of price stability, i.e. avoid both an increase in the price level of more than two percent annually and a decrease in the price level. Colloquially, the central bank is the bank of the state and of the banks: it acts as the lender of last resort for credit institutions and as the house bank of the state.
Commercial banking is the primary business of commercial banks. It comprises accepting customer deposits and granting loans. The commercial banks also handle general payment transactions and currency exchange. Alongside the central banks, the commercial banks are among the most important players in the financial market, since their areas of responsibility span the entire economy. In addition to commercial banking, there is so-called investment banking. This branch of banking deals primarily with activities such as client asset management, securities trading, mergers and acquisitions (M&A) and corporate finance.
Capital collecting points (institutional investors) are companies that receive payments as part of their business activities (building societies, insurance companies) and invest and manage this capital for a period of time. In addition, other institutions such as foundations, universities or churches invest their financial resources, which come from membership fees or sponsors. There are also so-called pension funds, which are funded by contributions from the employer and the beneficiary.
Investment companies are special companies that issue investment funds. Most of these companies are subsidiaries of banks. The investment fund works according to the following principle: investors pay in a certain amount and receive fund units in return for the value of their deposit. Through such sales of units, the investment fund ideally brings together several hundred million euros, which can be seen as a “pot” into which the investors pay. By purchasing such fund units in a diversified portfolio, investors can reduce the risk of their investment. This principle is attractive to customers who are looking for a less risky investment opportunity.
The capital market is the market for long-term financing (terms of more than 4 years). It is usually also referred to as the securities market, since mainly marketable securities such as shares or units in investment funds are issued and traded on it. A distinction is made between the bond market and the stock market, each of which is divided into a primary and a secondary market. The counterpart of the capital market is the so-called money market, which covers short- and medium-term financing. The capital market as such, which has the task of organizing medium- and long-term investment and borrowing, consists of two parts: (A) the organized capital market and (B) the unorganized capital market. The latter deals mainly with the direct or indirect trading of loans, participations and mortgages between providers and customers.
These two areas of the capital market mainly deal in securities. These include shares, industrial, bank and government bonds, and units in investment funds. Another component of the capital market is the mortgage loan market. All these types of investment are designed for long-term financing.
The organized capital market covers most long-term transactions involving credit institutions and capital collecting points. The most pronounced form of the organized capital market is the stock exchange. The organized capital market is constantly monitored by the state. Its counterpart is the unorganized capital market. This part of the capital market includes, in particular, credit relationships between companies, for example in the form of a long-term supplier credit. Credit relationships between households and between companies and households are also part of the unorganized capital market.
Securities are a very good vehicle for bringing capital into circulation. Capital-seeking companies or the state issue securities in the form of money market paper, bonds or shares, which are usually sold to capital investors by a bank or a group of banks. This first-time capital brokerage takes place on the primary market. The capital seeker, in this case the company that issues its securities, receives the corresponding countervalue for a certain number of shares. This process can be clearly traced by the flow of money and securities. The capital investors who, for example, acquired shares in a company when the securities were first issued on the primary market can put these securities back into circulation on the so-called secondary market, where they are sold only to other capital providers. In summary, the secondary market is a platform for trading securities that are already in circulation. This trade takes place on the stock exchange and is subject to certain legal regulations and controls. Another sector of the secondary market is over-the-counter trading, which is also called “telephone trading”.
In addition to brokering capital, the financial markets have other functions, such as lot size transformation. With this type of transformation, the banks bundle many small investment amounts to enable large investments to be financed. In addition to lot size transformation, there are also maturity transformation and risk transformation. The stock exchange is an elementary part of the financial market. Forecasts of financial data such as stock or fund prices are based on historical data, which must be taken into account. Since modern computer systems can use classification algorithms to analyze this enormous amount of data much faster than humans, more and more analytical tasks are delegated to machines. The prerequisite for carrying out these analytical tasks is the establishment of defined framework conditions.
Data acquisition and preprocessing
Forecasting with modern computer systems requires a relevant data basis. When building this data basis, it is necessary to select carefully which data should be taken into account, since every record that does not serve the target distorts it. Depending on the collection method and the nature of the data (video, sound, etc.), the data can also be degraded during collection. A voice recording, for example, is distorted by background noise, which makes the data basis incorrect. It is imperative that the interfering sounds are isolated from the voice pattern and removed before this record becomes part of the data basis. Such corrective intervention in advance is called preprocessing. Without preprocessing, it cannot be guaranteed that the data basis is relevant and meaningful for the intended purpose. Exogenous data must also be taken into account when building the data basis. The example of a share price makes this clear: the price movement depends on political decisions, regardless of its direction (positive or negative). Exogenous influences that are not considered have a huge impact on forecast quality. It follows that known and foreseeable exogenous factors must be included in the data basis.
When studying the data basis, features emerge that differ in how much they say about the object under consideration. An object under consideration is defined as a single object in the data basis. In the example of a share's performance, it is conceivable that some features describe the object under consideration explicitly. Such features are called direct features. It is also conceivable that features exist that describe the object directly only if they are studied in conjunction with other features. These are called indirect features. The aim of studying indirect features together is to establish new direct features. Merging features thus becomes an essential part of feature analysis, in order to obtain a direct description of the object under consideration. Features with valid informative value are not rejected if they are merged with a non-valid feature. Features that relate to identical objects under consideration may take different forms. It is therefore necessary to standardize the features studied in order to define the object under consideration generically. Features with different forms can be generalized. When generalizing features, it must be remembered that information is inevitably lost. It is necessary to decide which level of specificity and which level of generalization are required.
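The two steps of merging and standardizing features can be illustrated with a small sketch. Both choices here are assumptions made for this example and are not taken from the text above: the price/earnings ratio stands in for a direct feature built from two indirect ones, and z-scores (zero mean, unit standard deviation) are one common reading of "standardizing". The function names and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def merge_features(price, earnings_per_share):
    """Combine two indirect features into one direct feature.

    The price/earnings ratio is only an illustrative choice.
    """
    return price / earnings_per_share

def standardize(values):
    """Rescale values to zero mean and unit standard deviation (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical share data: (price, earnings per share)
SHARES = [(100.0, 5.0), (60.0, 2.0), (45.0, 4.5)]

ratios = [merge_features(price, eps) for price, eps in SHARES]
print(ratios)               # [20.0, 30.0, 10.0]
print(standardize(ratios))  # comparable values centered on zero
```

After standardization, features that were recorded on different scales can be compared directly, at the cost of losing the original units.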
Validation of the data collected
The classifications created must be critically assessed as to whether they describe the object under consideration sufficiently. For this purpose, the existing data can be divided into three parts. The first part is needed to fit the classifications. The second part is used to check the classifications, so that parameters for creating forecasts can be derived from them. The third part is used to verify the validity of the defined parameters. To verify the parameters as precisely as possible, the number of verification runs should be as high as possible, since more verification runs yield higher precision in the verification.
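The division into three parts can be sketched as follows. The 60/20/20 proportions, the fixed random seed and the name three_way_split are assumptions chosen for illustration; the text above does not prescribe them. Note also that for time-ordered data such as share prices, a chronological split is usually preferred over the random shuffle used here.

```python
import random

def three_way_split(records, train_share=0.6, validation_share=0.2, seed=0):
    """Split records into a training, a validation and a test part."""
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_share)
    n_validation = int(len(shuffled) * validation_share)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # everything that remains
    return train, validation, test

train, validation, test = three_way_split(range(100))
print(len(train), len(validation), len(test))  # 60 20 20
```

Repeating the split with different seeds and averaging the results is one simple way to increase the number of verification runs.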
In the next part, we will discuss the analysis techniques.