Thursday, April 11, 2024

Run Data Science workloads on OCI with GraalVM, Autonomous Database and GraalPy


GraalVM is a high-performance polyglot virtual machine that supports multiple languages, such as Java, JavaScript, Python, Ruby, R, and more. GraalVM can run either standalone or embedded in other environments, such as the Oracle Cloud Infrastructure (OCI).
The GraalVM Stack 
Data science is a fast-developing field that uses computational techniques to extract valuable insights from extensive and intricate datasets. Data scientists employ numerous tools and languages, including Python, R, SQL, and Java, to carry out data analysis, visualization, and machine learning tasks.

Working with large volumes of data and using different tools and languages can be a challenging and inefficient task for data scientists. Furthermore, traditional platforms that are not optimized for data-intensive applications may result in performance issues while running workloads.

Oracle Cloud Infrastructure (OCI) provides a powerful solution for data science workloads through GraalVM. As noted above, GraalVM supports multiple programming languages, including Python, R, Java, JavaScript, and Ruby. With GraalVM, data scientists can integrate different languages and libraries within the same application without giving up performance or interoperability.

A significant GraalVM feature is GraalPy, a fast and compatible implementation of Python that runs on GraalVM. With GraalPy, data scientists can run their existing Python code on GraalVM with minimal modifications and benefit from GraalVM's speed and scalability. GraalPy also gives access to other GraalVM languages such as R and Java, and to Python libraries such as NumPy.
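As a minimal sketch of this interoperability, the snippet below uses GraalPy's `java` module to fill a `java.util.ArrayList` from Python. The helper name `to_java_list` is hypothetical, and the fallback to a plain Python list is only an assumption made so that the same script also runs on a standard CPython interpreter, where the `java` module does not exist.

```python
# Sketch: use a Java class from Python when the script runs on GraalPy.
try:
    import java  # provided by GraalPy when running on the JVM; absent on CPython
except ImportError:
    java = None


def to_java_list(items):
    """Copy items into a java.util.ArrayList on GraalPy, else into a Python list."""
    if java is None:
        # Not running on GraalPy: fall back to a regular Python list.
        return list(items)
    ArrayList = java.type("java.util.ArrayList")  # look up the Java class
    result = ArrayList()
    for item in items:
        result.add(item)
    return result


if __name__ == "__main__":
    values = to_java_list([1.5, 2.5, 3.0])
    print(type(values).__name__)
```

On GraalPy the returned object is a Java list that can be handed to other Java code; on CPython it is just a copy of the input.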

Another advantage of using GraalVM for data science workloads is the integration with Oracle Autonomous Database (ADB), a fully managed cloud database service that provides high availability, security, and performance for any type of data. ADB supports both SQL and NoSQL data models, as well as built-in machine learning capabilities. OCI also offers a dedicated Data Science service that allows data scientists to collaborate and share their projects, models, and notebooks.

By combining GraalVM, ADB, and the Data Science service, data scientists get the best of both worlds: the flexibility and productivity of Python and other languages on GraalVM, and the reliability and scalability of ADB on OCI. In this blog post, I will show you how to run a simple data science workload on OCI using GraalVM, ADB, and OML4Py. It is a basic setup that shows how to use GraalVM on OCI with the Autonomous Database and Python for data science applications.

Prerequisites


The basic prerequisites for running your workloads are:

  • An OCI Cloud environment and a compartment with the necessary permissions to create and manage resources.
  • A GraalVM Enterprise Edition instance on OCI. You can use the GraalVM Enterprise Edition (GraalVM EE) - BYOL image from the OCI Marketplace to launch a compute instance with GraalVM EE pre-installed.
  • An Autonomous Database instance on OCI. You can use either the Autonomous Transaction Processing (ATP) or the Autonomous Data Warehouse (ADW) service, depending on your workload.
  • A Python development environment with pip and virtualenv installed. You can use the GraalVM EE instance as your development environment, or you can use a separate machine with SSH access to the GraalVM EE instance.

This diagram shows a simple setup of running your workload in the cloud. For production purposes it might be more complicated.


After you create an OCI Compute node, you can install GraalVM and GraalPy on it by following the steps later in this post. GraalPy is designed to run Python code with high performance on GraalVM and to interoperate with the other languages that GraalVM supports.

Specific components

To run data science workloads, you might use the following components:

  • GraalPy is a Python implementation that runs on GraalVM, a high-performance polyglot virtual machine that supports multiple languages such as Java, JavaScript, Ruby, R, and Python.
  • Oracle Autonomous Database is a cloud service that configures and optimizes your database for you, based on your workload. It supports different workload types, including Data Warehouse, Transaction Processing, JSON Database, and APEX Service.
  • A GraalPy workload is a workload that runs Python applications against the Oracle Autonomous Database, with GraalVM as the execution engine. This allows you to leverage the performance, scalability, security, and manageability of the Oracle Autonomous Database for your Python applications.

A possible workload on an Autonomous Database is a data analysis and machine learning application that uses the Oracle Machine Learning for Python (OML4Py) package. OML4Py is a Python package that provides an interface for data scientists and developers to work with data and models on the Autonomous Database. The package utilizes the in-database algorithms and parallel execution capabilities of the Autonomous Database, making data analysis and machine learning more scalable and efficient.

To run this application, you first install GraalVM Enterprise Edition on a compute node that can reach your Autonomous Database. Then you add Python support with the GraalVM Updater and create a Python environment on that node. After that, you can use the cx_Oracle module, or its successor python-oracledb, to connect to your database. Additionally, you will need to install the OML4Py package and its dependencies using the pip command. Finally, you can use the OML4Py API to load data from your database, explore and transform the data, create and train machine learning models, and evaluate and deploy these models.



Here is a code snippet that shows how to use OML4Py to create and train a logistic regression model on the iris dataset, a sample dataset that contains measurements of different species of iris flowers. Specifics such as the username and password come from your own setup.

# Import the OML4Py module
import oml

# Connect to the Autonomous Database; OML4Py opens the database connection itself.
# The dsn is a service alias from the tnsnames.ora file in your wallet.
oml.connect(user="username", password="password", dsn="dsn")

# Create a proxy object for the IRIS table in the database
iris = oml.sync(table="IRIS")

# Split the dataset into training and testing sets (70/30 by default)
train, test = iris.split()

# Create a logistic regression model: OML4Py exposes it as a GLM classifier.
# In-database GLM classification is binary, so the three-species target
# may first need to be reduced to two classes.
model = oml.glm("classification")

# Train the model on the training set: predictors first, then the target
model.fit(train.drop("Species"), train["Species"])

# Print the model details
print(model)

This script and the iris training model are described by Mark Hornick at https://shorturl.at/orwDR

To set up GraalVM Enterprise Edition on an Oracle Cloud Infrastructure (OCI) compute node with Autonomous Database (ADB) and GraalPython, follow these steps:

1. Create an OCI compute node with the desired shape and operating system. You can use the OCI console, CLI, or Terraform to do this.
2. Install GraalVM EE on the compute node. You can download the latest version from the Oracle Technology Network (OTN) or use the OCI Resource Manager to provision it automatically.
3. Configure GraalVM EE to work with ADB. Set the environment variables JAVA_HOME and GRAALVM_HOME to the GraalVM EE installation directory, and TNS_ADMIN to the directory where you store your ADB wallet files. Also add the GraalVM EE bin directory to your PATH variable.
4. Install GraalPython on GraalVM EE. You can use the GraalVM Updater tool (gu) to install GraalPython and its dependencies. For example, you can run `gu install python` to install GraalPython.
5. Download the client credentials (wallet) from the ADB service console, unzip them, and make sure the TNS_ADMIN environment variable points to the wallet directory. For example, run the following command:

export TNS_ADMIN=/path/to/wallet

6. Install the python-oracledb driver on GraalPython using the pip tool. For example, run the following command:

$GRAALVM_HOME/bin/pip install oracledb

7. Test your GraalPython installation and connection to ADB. You can use the GraalPython interactive shell (graalpython) or run a GraalPython script to connect to ADB and perform some queries. For example, you can run `graalpython connect.py` where connect.py is a script that uses the oracledb module to connect to ADB and execute some SQL statements.
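
Before testing the connection, it can help to check that the environment from the steps above is in place. The following is a minimal sketch, not part of any Oracle tooling: the function name `check_environment` is hypothetical, and it only assumes that the three variables named in the steps point to existing directories and that the unzipped wallet contains a tnsnames.ora file.

```python
import os
from pathlib import Path

# Variables named in the setup steps above.
REQUIRED_VARS = ("JAVA_HOME", "GRAALVM_HOME", "TNS_ADMIN")


def check_environment(env=None):
    """Return a list of problems found; an empty list means the setup looks complete."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{name} points to a missing directory: {value}")
    # The wallet directory should contain the tnsnames.ora file with the service aliases.
    wallet = env.get("TNS_ADMIN")
    if wallet and Path(wallet).is_dir() and not (Path(wallet) / "tnsnames.ora").is_file():
        problems.append("tnsnames.ora not found in TNS_ADMIN")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks complete"]:
        print(line)
```

Running the script on the compute node prints one line per missing piece, which narrows down configuration problems before any database call is made.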

To connect to Oracle Autonomous Database from your Python application, you can use the following code:

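import os
import oracledb

# Directory that holds the unzipped wallet; falls back to a placeholder path
wallet_dir = os.environ.get('TNS_ADMIN', '/path/to/wallet')

# Connect to the database using the service name from the tnsnames.ora file.
# The wallet arguments are what python-oracledb's default Thin mode expects for a
# wallet-based (mTLS) connection; 'wallet_password' is a placeholder.
conn = oracledb.connect(user='username', password='password', dsn='service_name',
                        config_dir=wallet_dir, wallet_location=wallet_dir,
                        wallet_password='wallet_password')
print(conn)
conn.close()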

To connect to the database, the ADB wallet must be unzipped in the directory that TNS_ADMIN points to. You can look up the service name for your ADB in the OCI console.
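
The service names are also listed as aliases in the tnsnames.ora file inside the wallet. As a rough sketch, the helper below (the name `list_service_names` is hypothetical) reads that file and returns the aliases, assuming each entry starts at the beginning of a line in the form `alias = (description=...)`.

```python
import os
import re
from pathlib import Path

# An alias starts at the beginning of a line and is followed by "=".
ALIAS_PATTERN = re.compile(r"^([A-Za-z0-9_.\-]+)\s*=", re.MULTILINE)


def list_service_names(wallet_dir):
    """Return the connection aliases defined in <wallet_dir>/tnsnames.ora."""
    text = (Path(wallet_dir) / "tnsnames.ora").read_text()
    return ALIAS_PATTERN.findall(text)


if __name__ == "__main__":
    # Uses TNS_ADMIN when set, otherwise the placeholder path from this post.
    wallet = os.environ.get("TNS_ADMIN", "/path/to/wallet")
    if (Path(wallet) / "tnsnames.ora").is_file():
        for name in list_service_names(wallet):
            print(name)
    else:
        print("no tnsnames.ora found in", wallet)
```

Any of the printed aliases can be used as the dsn value when connecting.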

This should give you a good start to experiment with GraalVM, GraalPy and Data Science in the Oracle Cloud. It's a powerful solution for your production workloads, and starting with the basics will help you explore the possibilities.



Tuesday, April 2, 2024

From introvert person to public speaker

I write and speak a lot about technology, but a personal touch can be nice and interesting too. This post tells the story of how an introvert like me became a public speaker. I hope it gives you some inspiration, maybe even to take your first step toward speaking in public.

Drive and enthusiasm

Not everyone feels the need to speak in public, so obviously you need the drive to want it. In the beginning I wasn’t really keen on speaking in public because I am shy by nature. My most important tips are: know what you want to tell, and practice, practice, practice.

And the most important thing of all: have fun doing it!

Knowing what you want to tell begins with doing research and combining it with your day-to-day experience and your view of the world. This is especially true in technology, but I suppose it also applies to other areas where people speak in public. It all begins with a good sense of direction. My “public speaking career” actually started with writing a book about technology, a beginner’s guide.

https://bit.ly/2Kk8seB

After speaking at an event, I noticed I enjoyed it. As time goes by, people get to know you better and you become more enthusiastic about speaking in public.

This doesn’t mean that everything always goes well. Sometimes the subject you chose is not interesting enough, sometimes your own performance needs improvement, and so on. It depends on multiple factors, but as a speaker you have a lot under your own control, even your own nerves. And believe me, even the most experienced speakers feel the excitement when they are about to speak in front of a large audience.

Prepare yourself

This may sound like stating the obvious, but knowing what to tell helps a lot in gaining self-confidence. My experience is not to include too much in a talk. I did that a few times and ended up not telling everything I wanted. There are plenty of tips available on how and what to present, and they all apply: make a presentation as visual as you can. It’s better not to have slides with a lot of text; put that text in the slide notes instead.

So preparing means: choose the subject, prepare some slides, and write your story down for yourself. That helped me a lot, but the uncertainty about being complete remains. In fact, in my talks I sometimes try to cover too much, so time management is really important if you want to cover all the content.

I also followed some courses on public speaking, and these are the most important things I took from them:

  • Do some storytelling: talk about everyday things to visualize what you want to tell. The audience will recognize them and will be more eager to listen to your story. Interacting with the audience boosted my self-confidence.
  • Have a good beginning and end. This means structuring your talk so the audience can think it over afterwards.
  • Know your position: can everyone see me well? Is the tone of my voice not too boring? Do I stand like someone with self-confidence?

You also don’t want to be disturbed by failing equipment or to struggle with your presentation devices. So I always arrive well before I start to present, to explore the room or hall and know what’s there.

Some presentations include demos. That can be a nice thing, but don’t make a demo too long or too complicated. I’ve seen a lot of demos where a lot of things are happening and code is flying across the screen, but they don’t add anything to the story. Prepare what you can prepare and demo a short and clear case.

My personal touch

Now, to prevent this story from becoming just a collection of hints and tips, I would like to add some personal touches. Remember that not all hints and tips work for everyone and they can be found everywhere. I love interacting with my audience during my talks. I enjoy seeing their reactions, from those who are interested and engaged to those who get a bit sleepy and start to close their eyes. Unfortunately, these days, all conferences are virtual, and we don't have the same level of interaction.

To keep my audience engaged, I always try to inject some humor and fun into my talks. I believe that getting your audience to laugh can boost your self-confidence and keep them interested.

However, even with all my experience, I still experience what's called "the imposter syndrome." This means that I sometimes fear being exposed as not being the expert people think I am. But I've learned that as long as I receive useful questions and see that people are interested and engaged, I don't have to worry about being "nailed."

Sometimes, I lose track of the structure of my story when I'm telling it. I get stuck and forget what I wanted to say. However, I've found that the best way to get back on track is to go back to my topics and start again from where I began.

Remember that not all talks will go smoothly all the time. I always question myself after a talk and ask if the topic was well-suited, if I was unclear, or if my tone was too monotone. Every talk you give can be a learning experience for the next one.

Support from the Oracle ACE community

Since I joined the ACE program in 2012 and became an Oracle ACE Director in 2019, my speaking activities have had a real boost: being at conferences, speaking with like-minded people, sharing but also gaining knowledge, and above all, meeting new people. One important thing at these conferences is networking. As an introvert, that is a bigger step to take than for someone who is not introverted, at least I suppose so. For me it's a huge step to talk to people I meet for the first time, but over time you meet people you've seen before and it becomes a bit easier. Still, it remains a challenge for me.
Anyway, the Oracle ACE program helped boost my speaking career.
For more information about the ACE program see: https://ace.oracle.com/

From introvert to public speaker

I am naturally an introverted person, but I enjoy speaking in public and sharing my knowledge with others. I appreciate receiving feedback, even if it is critical, as it helps to strengthen my confidence in my abilities. I hope that my story can inspire others to consider speaking in public, and I would be honored to be part of the audience and support them.

https://i.gifer.com/origin/c6/c653cdf2c5df010f4d74503986408205_w200.gif


How organizations can boost their Cloud Native Adoption: The CNCF Maturity Model

Introduction Cloud Native has become important for building scalable and resilient applications in today's IT landscape. As organization...