Adding Smart to your Applications

Created 2 years ago

Duration 0:43:28
Slide Content
  1. Dr. Harry Shum

    Slide 1 - Dr. Harry Shum

    • Executive Vice President
    • Technology and Research
    • Microsoft Project Oxford: Adding Smart to Your Applications
    • 2-613
    • Ryan Galgon
    • Senior PM
    • Microsoft Project Oxford
  4. Slide 4

    • What is Project Oxford?
    • A portfolio of REST APIs and SDKs which enable developers to write applications which understand the content within the rapidly growing set of multimedia data
  5. Slide 5

    • Understand the data around your application
    • Project Oxford’s API services will help you understand and interact with audio, text, image, and video
  6. Microsoft Project Oxford Services

    Slide 6 - Microsoft Project Oxford Services

    • PROJECT OXFORD
    • Speech APIs
    • LUIS (Language Understanding Intelligent Service)
    • Vision APIs
    • Face APIs
  7. Slide 7

    • Powerful models
    • Project Oxford’s models are trained using the same deep learning and machine learning techniques that power many products across Microsoft
  8. Slide 8

    • Easy to use
    • Project Oxford allows you to focus on your application by easily including these services across platforms through simple REST APIs
  13. Slide 13

    • Easily include Project Oxford Services
    • C# sample (client, imageUri, and personGroupId are assumed to be initialized elsewhere in the app):

        using System.Linq; // needed for Select(...)

        // Start with empty arrays so later code can enumerate the results safely.
        ProjectOxford.Face.Contract.Face[] detectionResults = new ProjectOxford.Face.Contract.Face[0];
        ProjectOxford.Face.Contract.IdentifyResult[] identifyResults = new ProjectOxford.Face.Contract.IdentifyResult[0];

        // Open the selected image as a stream; the using block disposes it afterwards.
        using (var imageFileStream = Context.ContentResolver.OpenInputStream(imageUri))
        {
            // Call the detection REST API, requesting age and gender attributes.
            detectionResults = await client.DetectAsync(imageStream: imageFileStream, analyzesAge: true, analyzesGender: true);
            // Call the identification REST API with the IDs of all detected faces.
            identifyResults = await client.IdentifyAsync(personGroupId, detectionResults.Select(face => face.FaceId).ToArray());
        }
  14. Slide 14

    • Microsoft Project Oxford Services
    • PROJECT OXFORD
    • Face APIs: Face Detection, Face Identification, Face Grouping
    • Vision APIs: Analyze Image, Generate Thumbnail, OCR
    • Speech APIs: Speech Recognition, Text to Speech, Speech Intent Recognition
    • LUIS (Language Understanding Intelligent Service): Detect Intent, Determine Entities, Improve Models
  15. Vision APIs

    Slide 15 - Vision APIs

    • Analyze an Image
    • OCR
    • Get Thumbnail
  16. Slide 16

    • Analyze Image Service
    • Understand content and features within an image
  17. Slide 17

    • Analyze Image – Example
    • Type of Image:
      • Clip Art Type: 0 (Non-clipart)
      • Line Drawing Type: 0 (Non-Line Drawing)
      • Black & White Image: False
    • Content of Image:
      • Categories: [{ "name": "people_swimming", "score": 0.099609375 }]
      • Adult Content: False
      • Adult Score: 0.18533889949321747
      • Faces: [{ "age": 27, "gender": "Male", "faceRectangle": {"left": 472, "top": 258, "width": 199, "height": 199}}]
    • Image Colors:
      • Dominant Color Background: White
      • Dominant Color Foreground: Grey
      • Dominant Colors: White
      • Accent Color
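
A minimal sketch of how an application might consume a result like the one above. The category and face values are copied from the slide; the top-level key names `categories` and `faces`, and the helper names, are assumptions made for illustration and are not confirmed by the deck.

```python
import json

# Values copied from the slide. The top-level keys "categories" and "faces"
# are assumed names for this sketch, not taken from the deck.
SAMPLE = json.loads("""
{
  "categories": [{"name": "people_swimming", "score": 0.099609375}],
  "faces": [{"age": 27, "gender": "Male",
             "faceRectangle": {"left": 472, "top": 258, "width": 199, "height": 199}}]
}
""")


def top_category(result):
    """Return the name of the highest-scoring category, or None if there are none."""
    categories = result.get("categories", [])
    if not categories:
        return None
    return max(categories, key=lambda c: c["score"])["name"]


def face_boxes(result):
    """Convert each faceRectangle into a (left, top, right, bottom) tuple."""
    boxes = []
    for face in result.get("faces", []):
        r = face["faceRectangle"]
        boxes.append((r["left"], r["top"],
                      r["left"] + r["width"], r["top"] + r["height"]))
    return boxes


if __name__ == "__main__":
    print(top_category(SAMPLE))
    print(face_boxes(SAMPLE))
```

Converting the rectangle to corner coordinates is one common choice when the box is later passed to a drawing or cropping routine.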
  18. Slide 18

    • OCR Service
    • Detect and recognize words within a photo
  19. OCR – Example

    Slide 19 - OCR – Example

    • JSON:
        {
          "language": "en",
          "orientation": "Up",
          "regions": [
            {
              "boundingBox": "41,77,918,440",
              "lines": [
                {
                  "boundingBox": "41,77,723,89",
                  "words": [
                    {
                      "boundingBox": "41,102,225,64",
                      "text": "LIFE"
                    },
                    {
                      "boundingBox": "356,89,94,62",
                      "text": "IS"
                    },
                    {
                      "boundingBox": "539,77,225,64",
                      "text": "LIKE"
                    }
        . . .
    • TEXT:
        LIFE IS LIKE
        RIDING A BICYCLE
        TO KEEP YOUR BALANCE
        YOU MUST KEEP MOVING
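
The JSON above nests words inside lines inside regions, and the TEXT view is what you get by joining the words of each line. A rough sketch of that flattening step follows, using only the portion of the response that is visible on the slide (the elided part is left out). The helper names are hypothetical, and the sketch does not assume what the four bounding-box numbers mean.

```python
import json

# Only the fragment shown on the slide; the elided remainder is omitted.
OCR_RESULT = json.loads("""
{
  "language": "en",
  "orientation": "Up",
  "regions": [
    {
      "boundingBox": "41,77,918,440",
      "lines": [
        {
          "boundingBox": "41,77,723,89",
          "words": [
            {"boundingBox": "41,102,225,64", "text": "LIFE"},
            {"boundingBox": "356,89,94,62", "text": "IS"},
            {"boundingBox": "539,77,225,64", "text": "LIKE"}
          ]
        }
      ]
    }
  ]
}
""")


def parse_box(box):
    """Split an "a,b,c,d" bounding-box string into a tuple of four integers."""
    return tuple(int(part) for part in box.split(","))


def lines_of_text(result):
    """Join the words of every line, in document order, into plain strings."""
    out = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            out.append(" ".join(word["text"] for word in line.get("words", [])))
    return out


if __name__ == "__main__":
    print(lines_of_text(OCR_RESULT))
    print(parse_box(OCR_RESULT["regions"][0]["boundingBox"]))
```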
  20. Slide 20

    • Smart Thumbnail Service
    • Scale and crop an image, while retaining key content
  21. Smart Thumbnail – Example

    Slide 21 - Smart Thumbnail – Example

  22. Face APIs

    Slide 22 - Face APIs

    • Detection
    • Verification
    • Grouping
    • Identification
  23. Face API – Detection

    Slide 23 - Face API – Detection

    • Detection Result:
    • JSON:
        [
          {
            "faceRectangle": {
              "width": 109,
              "height": 109,
              "left": 62,
              "top": 62
            },
            "attributes": {
              "age": 31,
              "gender": "male",
              "headPose": {
                "roll": "2.9",
                "yaw": "-1.3",
                "pitch": "0.0"
              }
            },
            "faceLandmarks": {
              "pupilLeft": {
                "x": "93.6",
                "y": "88.2"
              },
              "pupilRight": {
                "x": "138.4",
                "y": "91.7"
              },
              ...
    • Diagram labels: INPUT IMAGE, DETECTION, FACIAL RECTANGLE + LANDMARKS, ATTRIBUTES
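
As an illustration of working with a detection result, the sketch below reads the attributes and computes the distance between the two pupils. It uses only the fields visible on the slide and leaves out the elided landmarks. It assumes `faceLandmarks` is a sibling of `attributes` inside each face object, and the function names are hypothetical.

```python
import json
import math

# Fields copied from the slide; remaining landmarks are elided there and omitted here.
# Assumption: "faceLandmarks" sits next to "attributes" inside each face object.
DETECTION = json.loads("""
[
  {
    "faceRectangle": {"width": 109, "height": 109, "left": 62, "top": 62},
    "attributes": {
      "age": 31,
      "gender": "male",
      "headPose": {"roll": "2.9", "yaw": "-1.3", "pitch": "0.0"}
    },
    "faceLandmarks": {
      "pupilLeft": {"x": "93.6", "y": "88.2"},
      "pupilRight": {"x": "138.4", "y": "91.7"}
    }
  }
]
""")


def describe(face):
    """Build a short human-readable summary from the attributes block."""
    attrs = face["attributes"]
    return f'{attrs["gender"]}, age {attrs["age"]}'


def pupil_distance(face):
    """Euclidean distance between the pupils, in image pixels."""
    left = face["faceLandmarks"]["pupilLeft"]
    right = face["faceLandmarks"]["pupilRight"]
    # The slide shows coordinates as strings, so convert before doing arithmetic.
    dx = float(right["x"]) - float(left["x"])
    dy = float(right["y"]) - float(left["y"])
    return math.hypot(dx, dy)


if __name__ == "__main__":
    for face in DETECTION:
        print(describe(face), round(pupil_distance(face), 2))
```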
  24. Slide 24

    • Face API – Verification
    • Given two faces, determine whether they are the same person
    • Verification Result:
    • JSON:
        [
          {
            "isIdentical": false,
            "confidence": 0.01
          }
        ]
  25. Slide 25

    • CLUSTERED BY
    • DETECTED PEOPLE
    • Face API – Grouping
  26. Slide 26

    • Face API – Create Person Object
    • INPUT IMAGE, DETECTION, FACIAL RECTANGLE + LANDMARKS
    • CREATE PERSON GROUP: COLLEAGUES
    • ADD PERSON: Chao Wang (in group COLLEAGUES)
  27. Slide 27

    • Face API – Identify
    • NEW INPUT IMAGE
    • IDENTIFY
    • GROUP PERSON OBJECTS: COLLEAGUES (Natalie Huber, Chao Wang)
    • RECOGNITION: He is Chao Wang.
  28. Speech APIs powered by Bing

    Slide 28 - Speech APIs powered by Bing

    • Voice Recognition (Speech to Text)
    • Voice Output (Text to Speech)
  29. Slide 29

    • Voice Recognition
    • Converts spoken audio to text
    • Same backend which powers Cortana
    • Support for 7 languages at launch
    • en-US
    • en-GB
    • de-DE
    • es-ES
    • fr-FR
    • it-IT
    • zh-CN
  30. Duration of Audio

    Slide 30 - Duration of Audio

    • Voice Recognition Modes
    • Short Form
      • Duration of Audio: < 15 seconds
      • Final Result: n-best choice
      • Partial Results: Yes
    • Long Form
      • Duration of Audio: < 2 minutes
      • Final Result: Best Choice, delivered at sentence pauses
      • Partial Results: Yes
    • Example utterance: 450 6th St. San Francisco
    • ********* Final N-BEST Results *********
    • [0] Confidence=Normal Text="450 six St San Francisco."
    • [1] Confidence=Normal Text="For 50 six St San Francisco."
    • [2] Confidence=Normal Text="456th St San Francisco."
    • [3] Confidence=Normal Text="450 six St in San Francisco."
    • [4] Confidence=Normal Text="456 St San Francisco."
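
The n-best output above is plain text, so an application that wants to offer the alternatives to a user has to parse it first. A small sketch of one way to do that is below. The line format is taken from the slide; the regular expression and function name are hypothetical, and treating index 0 as the preferred hypothesis is an assumption.

```python
import re

# Console output copied from the slide.
NBEST_OUTPUT = '''********* Final N-BEST Results *********
[0] Confidence=Normal Text="450 six St San Francisco."
[1] Confidence=Normal Text="For 50 six St San Francisco."
[2] Confidence=Normal Text="456th St San Francisco."
[3] Confidence=Normal Text="450 six St in San Francisco."
[4] Confidence=Normal Text="456 St San Francisco."'''

# Matches lines such as: [0] Confidence=Normal Text="..."
LINE_RE = re.compile(r'^\[(\d+)\]\s+Confidence=(\w+)\s+Text="(.*)"$')


def parse_nbest(output):
    """Return one dict per hypothesis line; lines that do not match are skipped."""
    results = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results.append({
                "rank": int(match.group(1)),
                "confidence": match.group(2),
                "text": match.group(3),
            })
    return results


if __name__ == "__main__":
    hypotheses = parse_nbest(NBEST_OUTPUT)
    # Assumption: the list is ordered with the preferred hypothesis first.
    print(hypotheses[0]["text"])
    for alt in hypotheses[1:]:
        print("alternative:", alt["text"])
```

None of the five hypotheses matches the spoken "450 6th St." exactly, which is a reason an application may want to show the alternatives instead of silently taking the first one.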
  31. Slide 31

    • Voice Recognition: REST API vs. Client Library
    • Supported platforms: REST API: Any; Client Library: Windows, Android, iOS
    • Data support: REST API: Yes; Client Library: Yes
    • Mic support: REST API: No; Client Library: Yes
    • Silence detection on mic: REST API: No; Client Library: Yes
    • Length of utterance: REST API: Short; Client Library: Short and long
    • Number of responses: REST API: n-best response back; Client Library: multiple partial results, n-best (short) and multiple phrases (long)
    • Windows 10 has Speech APIs built in
  33. Slide 33

    • Voice Output
    • Synthesize audio from text, to speak to your users.
    • Synthesize audio from text via POST request
    • Maximum audio return of 15 seconds
    • 17 languages supported at launch
    • Example SSML:
        <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-US">
          <voice name="Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)">
            Synthesize audio from text, to speak to your users.
          </voice>
        </speak>
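
Since the text to speak is embedded in an XML document, characters such as `&` and `<` need escaping before the request is sent. The sketch below builds a document shaped like the one on the slide. The namespaces and voice name are copied from the slide; the function names are hypothetical, and the service endpoint and authentication are deliberately left out because the deck does not show them.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Voice name copied from the slide.
DEFAULT_VOICE = "Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)"


def build_ssml(text, voice_name=DEFAULT_VOICE, lang="en-US"):
    """Return an SSML document that wraps `text` in a single <voice> element."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="http://www.w3.org/2001/mstts" '
        f'xml:lang={quoteattr(lang)}>'
        f'<voice name={quoteattr(voice_name)}>'
        f'{escape(text)}'  # escapes &, < and > so the XML stays well-formed
        '</voice></speak>'
    )


def spoken_text(ssml):
    """Parse the SSML back and return the text inside the first <voice> element."""
    root = ET.fromstring(ssml)
    voice = root.find("{http://www.w3.org/2001/10/synthesis}voice")
    return voice.text


if __name__ == "__main__":
    ssml = build_ssml("Synthesize audio from text, to speak to your users.")
    print(ssml)
    print(spoken_text(ssml))
```

The resulting string would be the body of the POST request mentioned on the slide.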
  35. Language Understanding Intelligent Service

    Slide 35 - Language Understanding Intelligent Service

    • Determine Intent
    • Detect Entities
    • Improve Models
  36. Slide 36

    • Language Understanding Models
    • "News about flight delays"
        {
          "entities": [
            {
              "entity": "flight_delays",
              "type": "Topic"
            }
          ],
          "intents": [
            {
              "intent": "FindNews",
              "score": 0.99853384
            },
            {
              "intent": "None",
              "score": 0.07289317
            },
            {
              "intent": "ReadNews",
              "score": 0.0167122427
            },
            {
              "intent": "ShareNews",
              "score": 1.0919299E-06
            }
          ]
        }
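
A response like the one above is typically reduced to a single action by taking the highest-scoring intent and then reading the entities it needs. The sketch below shows that step on the slide's JSON. The function names and the 0.5 score threshold are assumptions for illustration, not part of LUIS.

```python
import json

# Response copied from the slide (straight quotes so it parses as JSON).
LUIS_RESPONSE = json.loads("""
{
  "entities": [
    {"entity": "flight_delays", "type": "Topic"}
  ],
  "intents": [
    {"intent": "FindNews", "score": 0.99853384},
    {"intent": "None", "score": 0.07289317},
    {"intent": "ReadNews", "score": 0.0167122427},
    {"intent": "ShareNews", "score": 1.0919299E-06}
  ]
}
""")


def top_intent(response, threshold=0.5):
    """Return the best-scoring intent name, or "None" if it scores below threshold.

    The threshold is a hypothetical application-side cutoff.
    """
    intents = response.get("intents", [])
    if not intents:
        return "None"
    best = max(intents, key=lambda item: item["score"])
    return best["intent"] if best["score"] >= threshold else "None"


def entities_of_type(response, entity_type):
    """Return the entity values whose type matches entity_type."""
    return [e["entity"] for e in response.get("entities", [])
            if e["type"] == entity_type]


if __name__ == "__main__":
    print(top_intent(LUIS_RESPONSE))
    print(entities_of_type(LUIS_RESPONSE, "Topic"))
```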
  37. Slide 37

    • Language Understanding Intelligent Service
    • Lets you understand what your users are saying
    • Seamless integration with Speech Recognition
    • A few examples are enough to deploy an application
    • LUIS learns over time
    • Define Concepts
    • Provide Examples
    • Active Learning
    • Deploy
  39. Getting started with Project Oxford

    Slide 40 - Getting started with Project Oxford

    • Visit https://www.projectoxford.ai and select "Sign up" for any of the offered services
    • The Azure Management Portal will launch
    • On the "Choose an Application or Service" page, select a service, such as "Face APIs", from the list
    • Fill in the requested information to purchase a free tier
    • Find the service under the "marketplace" tab
    • Select the service and click "Manage"
    • You now have your developer keys, ready to use in your applications
    • You can also subscribe directly from the Azure Management Portal Marketplace
  40. Slide 42

    • Microsoft Project Oxford
    • A portfolio of REST APIs and SDKs which enable developers to write applications which understand the content within the rapidly growing set of multimedia data
  41. Slide 43

    • Call to Action
    • Visit http://www.projectoxford.ai to learn more
    • Stop by the Project Oxford booth
    • Give us feedback in our forum
    • Don’t miss the Bing talk on the Vision APIs